Ad-Hoc Analysis
An analysis or report created on the spot to answer specific business questions that are not clearly addressed by a pre-built report.
Ad hoc analysis is performed against an existing data warehouse or transactional database; depending on the business need, the result can be the direct output of a query, a data visualization, a static report, or a combination of these.
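As a minimal sketch of the idea, the following Python snippet answers an on-the-spot question against an in-memory SQLite database standing in for a warehouse; the sales table and its columns are hypothetical.

```python
import sqlite3

# In-memory stand-in for a warehouse; the sales table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", "2024-02-10", 50000.0), ("West", "2024-03-05", 160000.0)],
)

# The ad-hoc question: which regions underperformed in Q1? No pre-built
# report answers this, so the query is written on the spot.
rows = conn.execute(
    """
    SELECT region, SUM(amount) AS total_sales
    FROM sales
    WHERE sale_date BETWEEN '2024-01-01' AND '2024-03-31'
    GROUP BY region
    HAVING total_sales < 100000
    ORDER BY total_sales
    """
).fetchall()
for region, total in rows:
    print(region, total)
```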
Alternative Data
Amazon Aurora
Amazon Aurora is a relational database service available on Amazon Web Services (AWS) through Amazon RDS.
- The database is high-performing and fault-tolerant, with built-in read replicas for high read scalability.
- Compatible with both MySQL and PostgreSQL.
- Designed to automatically scale storage up to 64 terabytes.
- Available integrations: AWS services such as Amazon CloudWatch, Amazon RDS, and AWS Identity and Access Management (IAM); a provisioning sketch follows this list.
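A minimal provisioning sketch using boto3, assuming AWS credentials and permissions are configured; all identifiers and the password are hypothetical placeholders.

```python
import boto3  # assumes AWS credentials and permissions are configured

rds = boto3.client("rds", region_name="us-east-1")

# Create a MySQL-compatible Aurora cluster; identifiers are placeholders.
rds.create_db_cluster(
    DBClusterIdentifier="example-aurora-cluster",
    Engine="aurora-mysql",
    MasterUsername="admin",
    MasterUserPassword="change-me",  # use a secrets manager in practice
)

# Add a reader instance to the cluster for read scalability.
rds.create_db_instance(
    DBInstanceIdentifier="example-aurora-reader",
    DBClusterIdentifier="example-aurora-cluster",
    DBInstanceClass="db.r6g.large",
    Engine="aurora-mysql",
)
```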
Amazon Redshift
Amazon Redshift is a fully managed, highly scalable (up to petabytes) enterprise data warehouse service provided by Amazon Web Services (AWS). Users can easily set up, operate, and scale a data warehouse in the cloud. Columnar storage and advanced compression techniques provide high performance and scalability for data warehousing and business intelligence workloads.
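A minimal connection-and-query sketch using Amazon's redshift_connector driver; the cluster endpoint, credentials, and fact_orders table are hypothetical.

```python
import redshift_connector  # Amazon's Python driver for Redshift

# Endpoint, credentials, and the fact_orders table are hypothetical.
conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="analytics",
    user="awsuser",
    password="change-me",
)
cursor = conn.cursor()

# Columnar storage keeps scans over a few columns of a wide table cheap.
cursor.execute(
    "SELECT order_date, SUM(revenue) FROM fact_orders GROUP BY order_date"
)
for order_date, revenue in cursor.fetchall():
    print(order_date, revenue)
```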
Azure Data Factory
Azure Synapse
Bar Chart
A bar chart represents data with rectangular bars whose lengths are proportional to the values they represent. The bars can be plotted vertically or horizontally, either clustered or stacked; in a stacked bar chart, each bar represents the total value of one category and the sections within that bar represent its subcategories.
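A minimal stacked bar chart sketch with matplotlib; the sales figures are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical quarterly sales, stacked by product line.
quarters = ["Q1", "Q2", "Q3", "Q4"]
hardware = [120, 135, 150, 160]
software = [80, 95, 110, 140]

fig, ax = plt.subplots()
ax.bar(quarters, hardware, label="Hardware")
# Stacking: each full bar is the category total; sections are subcategories.
ax.bar(quarters, software, bottom=hardware, label="Software")
ax.set_ylabel("Sales (k$)")
ax.legend()
plt.show()
```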
Business Intelligence
Business Intelligence Developer
A Business Intelligence (BI) Developer is a professional responsible for designing, developing, and maintaining BI systems and solutions. They use a variety of tools and technologies to extract, transform, and load (ETL) data from different sources, and then use that data to create reports and dashboards that support data-driven decisions.
CI / CD Integration
CI/CD, or Continuous Integration and Continuous Delivery, is a software development practice that enhances efficiency and reliability in the software development and release process. Continuous Integration involves frequent code merging into a central repository, triggering automated build and testing procedures within minutes of code changes. Continuous Delivery extends this by automating the entire release process, including infrastructure provisioning and deployment.
Key elements of a CI/CD pipeline include stages like source code integration, building runnable software instances, running automated tests, and deploying to various environments. Failures at any stage trigger notifications to the responsible developers. CI/CD pipelines ensure fast, reliable, and accurate software delivery, with a focus on speed, reproducibility, and automation. They also promote a culture of collaboration, enabling developers to focus on coding, ensuring code quality, and facilitating easy access to the latest versions. Overall, CI/CD streamlines development, enhances product quality, and accelerates software delivery.
Cloud Analysis
Cognitive Search
Cognitive Search revolutionizes enterprise search by using machine learning and natural language processing to deliver highly personalized and accurate results. Unlike traditional keyword-based searches, it understands user intent and context, enhancing information discovery.
Businesses benefit from Cognitive Search in several ways. It empowers self-service solutions, making it easier for customers and employees to find information across multiple sources. Intelligent chat-bots improve user interactions, while in customer service, it enhances contact center quality through knowledge consolidation and automation.
The future of Cognitive Search includes voice and visual search integration. Choosing the right Cognitive Search engine involves considering factors like architecture, scalability, connectivity, intelligence, security, content processing, customization, and insights to align with specific needs and technology stacks.
In summary, Cognitive Search is an AI-driven search solution that elevates information retrieval, making it more efficient and tailored for businesses.
Dashboard
A dashboard acts as a hub for monitoring and analyzing organizational performance and can be accessed through web browsers or mobile devices.
Data Governance
Data Lake
Data Mesh
Data Strategy
A data strategy encompasses the policies, procedures, and technologies needed to manage and leverage data effectively. A well-defined data strategy is essential for organizations to make informed decisions, drive innovation, and enhance their competitive edge.
Data Warehouse
Data is integrated into a data warehouse from transactional systems (e.g., ERP, CRM), relational databases, and other data sources, typically on a regular cadence.
In a data warehouse, data is stored in tables organized around hierarchical dimensions.
Data Warehouse Automation
Data Warehouse Automation (DWA) is a modern approach to streamlining and accelerating data warehouse development and management by automating tasks across the data warehousing process: designing the data warehouse, generating code for data extraction, transformation, and loading (ETL), deploying code to servers, executing ETL processes, and monitoring and reporting on batch executions. DWA delivers faster development, adaptability to changing business needs, a focus on reporting and analytics rather than ETL code, better data quality, and consistency in code and naming standards. DWA tools are primarily used by professional data warehouse developers, and with proper training and support they can empower technical data analysts to maintain their own data warehouses. DWA tools can be deployed on on-premises servers or in the cloud.
DAX
Dimensions
Dimensions in the context of data warehousing and business intelligence are collections of reference information that provide essential context and categorization for measurable events or facts. These dimensions enable a comprehensive understanding and analysis of these facts by offering background details and structure, such as product information, customer profiles, geographic territories, and temporal data. Dimensions serve as foundational elements in data models, facilitating efficient historical data analysis and enabling meaningful answers to business queries. They are fundamental for organizing and making sense of a set of related facts within a data warehouse or BI system.
Dimensions Tables
A dimension table stores the descriptive attributes of a dimension; in other words, it is one of the tables of a star or snowflake schema in a data warehouse.
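A minimal sketch of a dimension table in use, joining a hypothetical fact table to a hypothetical product dimension with pandas.

```python
import pandas as pd

# Hypothetical star schema: one fact table, one dimension table.
dim_product = pd.DataFrame({
    "product_key": [1, 2],
    "product_name": ["Widget", "Gadget"],
    "category": ["Tools", "Electronics"],
})
fact_sales = pd.DataFrame({
    "product_key": [1, 1, 2],
    "amount": [100.0, 150.0, 90.0],
})

# The dimension supplies the context; join facts to it on the surrogate key.
report = (
    fact_sales.merge(dim_product, on="product_key")
    .groupby("category")["amount"]
    .sum()
)
print(report)
```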
Doughnut Chart
A doughnut chart is an extended variant of a pie chart that features a circular shape with a hole in the center. Similar to a pie chart, it is used to visualize data in a way that highlights the proportion of different categories. The outer ring represents the total value, while the segments within the ring represent individual categories and their respective values.
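A minimal doughnut chart sketch with matplotlib; the market-share numbers are hypothetical. The hole is produced by shrinking the wedge width of an ordinary pie chart.

```python
import matplotlib.pyplot as plt

# Hypothetical market-share data.
labels = ["Product A", "Product B", "Product C"]
shares = [45, 30, 25]

fig, ax = plt.subplots()
# An ordinary pie chart whose wedges are narrowed, leaving the center hole.
ax.pie(shares, labels=labels, autopct="%1.0f%%", wedgeprops={"width": 0.4})
ax.set_title("Market share")
plt.show()
```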
Data Cleansing
Data cleansing, also referred to as data cleaning or data scrubbing, is the process of correcting inaccuracies, errors, missing entries, and inconsistencies in data, ensuring the data is accurate and reliable for analysis and other uses.
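A minimal cleansing sketch with pandas over hypothetical customer records: normalizing text, dropping missing names, flagging impossible ages, and removing duplicates.

```python
import pandas as pd

# Hypothetical customer records with typical quality problems.
df = pd.DataFrame({
    "name": ["Alice", "alice ", "Bob", None],
    "age": [34.0, 34.0, -1.0, 29.0],
})

df["name"] = df["name"].str.strip().str.title()  # normalize case/whitespace
df = df.dropna(subset=["name"])                  # drop rows missing a name
df.loc[df["age"] < 0, "age"] = None              # flag impossible ages
df = df.drop_duplicates()                        # remove duplicate entries
print(df)
```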
Data wrangling
A subset of the ETL process focused on cleaning, transforming, and preparing raw data for analysis. Example: sales transactions include PIN codes, but the region or place corresponding to each PIN code is missing or inconsistent; the process of curating this data is called data wrangling.
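A minimal wrangling sketch with pandas that mirrors the PIN-code example above; the lookup table is a hypothetical stand-in for a reference data set.

```python
import pandas as pd

# Hypothetical sales transactions: PIN codes present, regions missing/inconsistent.
sales = pd.DataFrame({
    "txn_id": [1, 2, 3],
    "pin_code": ["560001", "110001", "560001"],
    "region": [None, "delhi", "BANGALORE"],
})

# Hypothetical reference lookup used to backfill a consistent region.
pin_to_region = {"560001": "Bangalore", "110001": "Delhi"}
sales["region"] = sales["pin_code"].map(pin_to_region)
print(sales)
```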
ELT
ETL
Fact Table
Factless Fact Tables
Hierarchical Dimensions
A simple example of a hierarchy widely adopted in dimensional data warehouses is the date dimension: Year > Quarter > Month > Week > Day.
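A minimal sketch that derives this hierarchy as a date dimension table with pandas.

```python
import pandas as pd

# Build a date dimension carrying the Year > Quarter > Month > Week > Day hierarchy.
dates = pd.date_range("2024-01-01", "2024-12-31", freq="D")
dim_date = pd.DataFrame({
    "date": dates,
    "year": dates.year,
    "quarter": dates.quarter,
    "month": dates.month,
    "week": dates.isocalendar().week.to_numpy(),
    "day": dates.day,
})
print(dim_date.head())
```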
Incremental Load
Key Performance Indicators
Line Chart
A line chart represents data as a series of data points connected by straight line segments.
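A minimal line chart sketch with matplotlib; the revenue series is hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical monthly revenue plotted as a trend over time.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [10.2, 11.0, 9.8, 12.4, 13.1, 12.9]

fig, ax = plt.subplots()
ax.plot(months, revenue, marker="o")  # points joined by straight segments
ax.set_xlabel("Month")
ax.set_ylabel("Revenue (M$)")
plt.show()
```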
Measures
In the realm of Business Intelligence, a measure refers to a fundamental data point that encapsulates a specific aspect of a business process or entity. These measures can take the form of qualitative information, like a customer’s name, or quantitative data, such as monthly revenue figures. Measures are pivotal in data modeling, enabling calculations and summarizations for analytical purposes. Qualitative measures are typically associated with dimensions, which provide context and additional details to enrich the interpretation of the measured data. By leveraging measures and dimensions, businesses can make more informed decisions and gain valuable insights in their BI endeavors.
The measures in a fact table can be additive, semi-additive, or non-additive.
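A minimal sketch of the distinction with pandas: revenue and cost sum cleanly across regions (additive), while a margin percentage must be recomputed from additive parts rather than summed (non-additive); the figures are hypothetical.

```python
import pandas as pd

# Hypothetical fact rows: revenue and cost are additive; margin percent is not.
facts = pd.DataFrame({
    "region": ["East", "East", "West"],
    "revenue": [100.0, 200.0, 300.0],
    "cost": [80.0, 150.0, 210.0],
})

# Additive: summing across a dimension is meaningful.
by_region = facts.groupby("region")[["revenue", "cost"]].sum()

# Non-additive: recompute the ratio from additive parts; never sum percentages.
by_region["margin_pct"] = 100 * (1 - by_region["cost"] / by_region["revenue"])
print(by_region)
```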
Metadata
MOLAP
OLAP
OLTP
An Online Transaction Processing (OLTP) system refers to large, transaction-oriented applications designed for rapid and accurate data processing.
Power BI
Pentaho
Qlik View
Real-Time Data Integration
Relational Database Management System
A software application that stores data in tables, where each table represents a specific entity and data within a table is stored in rows and columns. Because the database is relational, one table is connected to another through keys. The majority of software applications use an RDBMS as their back end to store transactional data.
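A minimal sketch using Python's built-in sqlite3 module: two entity tables related through a key, queried with a join; the schema is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce relational keys

# Two entity tables related through a key.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    )
""")
conn.execute("INSERT INTO customers VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders VALUES (1, 1, 99.50)")

# One table connects to another through the key.
for name, amount in conn.execute(
    "SELECT c.name, o.amount FROM orders o JOIN customers c ON o.customer_id = c.id"
):
    print(name, amount)
```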
Schema
Semantic Layer
A semantic layer is a data abstraction that translates complex data structures into user-friendly terms. It is created by individuals who understand both data storage and business reporting needs. They rename raw data fields into intuitive business terms, hide unnecessary fields, and provide pre-defined filters for common queries. This layer allows users to access data through a simplified interface, organize it into folders, and run reports without needing extensive knowledge of data storage or query languages. Users can also customize the presentation of data for their reporting requirements. The semantic layer streamlines access to information, reducing dependence on developers and fostering a common language for report creation. Benefits include collaborative prototyping, query uniformity, report history independent of data store changes, and enhanced cross-departmental collaboration. In summary, a semantic layer simplifies data access and reporting, making it more user-friendly and efficient for businesses.
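A minimal sketch of the renaming-and-hiding idea with pandas; the raw column names and the SEMANTIC_MODEL mapping are hypothetical.

```python
import pandas as pd

# Hypothetical raw warehouse extract with system-oriented column names.
raw = pd.DataFrame({
    "cust_nm_txt": ["Alice", "Bob"],
    "rev_amt_usd": [1200.0, 950.0],
    "etl_batch_id": [42, 42],  # internal field, hidden from business users
})

# Tiny semantic layer: rename raw fields to business terms, hide the rest.
SEMANTIC_MODEL = {"cust_nm_txt": "Customer Name", "rev_amt_usd": "Revenue (USD)"}

business_view = raw[list(SEMANTIC_MODEL)].rename(columns=SEMANTIC_MODEL)
print(business_view)
```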
Serverless Integration
Serverless data integration is a cloud computing model that frees developers from managing server infrastructure. It allows them to focus solely on creating applications, enhancing productivity. This approach streamlines app development by eliminating the need to provision and maintain infrastructure manually. Serverless data integration is vital for resolving data quality issues caused by data silos in organizations. It replaces traditional ETL processes, offering agility and cost savings. Key capabilities include automatic infrastructure setup and event-driven execution. In essence, it leverages serverless architecture, triggered by events like HTTP requests, enabling efficient data integration without the need for dedicated servers.
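A minimal sketch of an event-driven integration function in the style of an AWS Lambda handler; the event shape and transform logic are hypothetical placeholders.

```python
import json

# Event-driven handler: the platform invokes this when an event arrives,
# with no server for the developer to provision or manage.
def handler(event, context):
    # Hypothetical payload: a JSON record delivered by the trigger.
    record = json.loads(event["body"]) if "body" in event else event

    # Minimal transform standing in for the integration logic.
    cleaned = {k.lower(): v for k, v in record.items() if v is not None}
    return {"statusCode": 200, "body": json.dumps(cleaned)}

# Local invocation for illustration.
print(handler({"body": json.dumps({"Name": "Alice", "Region": None})}, None))
```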
Snowflake Schema
SQL Server Analysis Services - SSAS
SQL Server Analysis Services (SSAS) is a component of Microsoft SQL Server that enables organizations to analyze large datasets by organizing them into easily searchable cubes. It offers multidimensional and tabular capabilities for optimized querying and data mining. Multidimensional OLAP (MOLAP) uses optimized storage for fast queries, while tabular mode compresses data in-memory for even faster performance. SSAS allows users to analyze data from various angles, facilitating historical and trend analysis. OLAP cubes are essential components of data warehouses, providing quick insights and the ability to slice, dice, and solve problems.