Log and Telemetry Analytics Performance Benchmark v1.01

Microsoft Azure Data Explorer (part of Azure Synapse Analytics), Google BigQuery, Snowflake, and Elasticsearch/OpenSearch

1. Executive Summary

The number of connected devices, including the machines, sensors, and cameras that make up the Internet of Things (IoT), continues to grow rapidly. By 2025, we can expect tens of billions of IoT devices to produce zettabytes of data, with machine-generated data outstripping data generated by humans. At the same time, data-driven organizations are generating expanding volumes of log-like data and relying on analytic databases to load, store, and analyze that data at high speed to derive insights and take timely action.

This report focuses on the performance of the popular, cloud-enabled, enterprise-ready log analytics platforms Microsoft Azure Data Explorer (part of Azure Synapse Analytics), Google BigQuery, and Snowflake. Due to cost limitations with Elasticsearch and AWS OpenSearch, we could not run our tests on either of those platforms. Microsoft invited GigaOm to measure the performance of the Azure Data Explorer engine and compare it with its leading competitors in the log analytics space. The tests we designed simulate a set of basic scenarios to answer fundamental business questions that an organization in nearly any industry might encounter in its log analytics.

In this report, we tested a complex workload over a 100TB dataset at concurrency levels of 1 and 50 users. The testing was conducted using comparable hardware configurations on Microsoft Azure and Google Cloud.

Of course, testing platforms across cloud vendors is challenging. Configurations can favor one cloud vendor over another in feature availability, virtual machine processor generations, memory amounts, optimal input/output storage configurations, network latencies, software and operating system versions, and the testing workload. Our testing demonstrates a slice of potential configurations and workloads.

As the sponsor of the report, Microsoft selected the specific configuration of its platform that it wanted to test. GigaOm then selected the Snowflake configuration closest to the ADX configuration in terms of CPU and memory. Google BigQuery is offered “as-is,” with no configuration choices other than slot commitments (discussed below).

We leave the test’s fairness for the reader to determine. We strongly encourage you to look past marketing messages and discern its value for yourself. We hope this report is informative and helpful in uncovering some of the challenges and nuances involved in platform selection.

The parameters to replicate this test are provided in this document. We developed the testing suite from scratch because no existing benchmark framework met our needs. The testing application and the queries used are part of a code repository on GitHub.

Overall, the test results were insightful and revealed the query execution performance of the three platforms tested. Some of the highlights include:

  • Azure Data Explorer (ADX) outperformed Google BigQuery and Snowflake on all 19 test queries with a single user and 18 of 19 with 50 concurrent users.
  • ADX completed all 19 queries in under 1 second with a single user, while the average execution time on BigQuery and Snowflake was 15 and 13 seconds per query, respectively.[1]
  • ADX completed all 19 queries in under 10 seconds with 50 concurrent users.
  • BigQuery and Snowflake both had 8 queries (different ones) that did not complete within the 2-minute timeout.[2]
  • We found the cost of data ingestion and query execution on ADX to be significantly lower than on Google BigQuery, Snowflake, and Elasticsearch/OpenSearch.
  • We found it infeasible to test Elasticsearch and OpenSearch with this workload due to slow data loading speeds, expansion of the on-disk data volume (as compared to compression), and the inability to scale hot-tier (SSD) storage independently or in a cost-effective manner. The parameters to replicate this test are provided. You are encouraged to compile your own representative queries, data sets, data sizes, and test-compatible configurations applicable to your requirements.

    [1] There were several performance outliers. The geometric mean of query execution for BigQuery and Snowflake was approximately 4.5 seconds per query.
    [2] Using the 50th percentile, i.e., the 25th of the 50 queries to complete (or be canceled).

2. Log Analytics Platforms Tested

For this field test, we analyzed the performance of three popular platforms often used to store and analyze machine-generated data. Note that these platforms are not just for storing, processing, and analyzing log data, but can also be used for other analytical purposes.

Azure Data Explorer (ADX)

Microsoft Azure Data Explorer (ADX) is a fully-managed cloud analytics platform created as a unique combination of three common database technologies: column store, text indexing, and data sharding. It was launched in 2018 and is used for ingesting and querying massive telemetry data streams in scenarios such as logs and metrics monitoring and diagnostics, machine learning, and time-series analytics. ADX natively supports the Kusto Query Language (KQL) and T-SQL. Unlike the BigQuery BI Engine and Snowflake Search Optimization features, ADX does not charge extra for query accelerators and optimizers. ADX separates compute and storage, which allows the creation of many clusters over the same dataset, and it also provides predictive autoscale, which optimizes cost.

BigQuery

Google BigQuery is a fully-managed, serverless data-warehouse technology, exclusively offered in Google Cloud, that enables scalable analysis over large datasets. It is a Software as a Service (SaaS) offering that supports querying using ANSI SQL. It also has some built-in machine-learning functions. It was first offered in 2011 and is based on Dremel, a distributed system developed by Google for interactively querying large datasets.

BigQuery provides compute through “slots.” While you can commit to a certain number of slots for a month or a year for a discount, you can also buy Flex Slots at $4 per 100 slots per hour. There is a separate charge for BigQuery Active Storage.
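
As a point of reference, the 2,000 slots we used (see Table 1) would cost 2,000 / 100 × $4 = $80 per hour at the Flex Slot rate; the lower $55 per hour shown in Table 1 is consistent with commitment-based pricing.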

Additionally, BigQuery has an in-memory analysis service called BI Engine. By using BI Engine, you can analyze data stored in BigQuery with faster query response times for certain types of queries. The BI Engine design automatically tunes queries by moving data in and out of its in-memory storage, so no tuning was required for our queries to utilize the service. At the time of testing, we purchased 100GB of in-memory storage at $30.36 per GB per month (or $0.0416 per GB per hour). The BigQuery optimizer automatically managed BI Engine and decided which queries utilized the service. With a maximum of 100GB, it is likely that BI Engine only made an impact on queries that filtered on small time ranges.
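
At the hourly rate, that 100GB reservation added roughly 100 × $0.0416 ≈ $4.16 per hour to our BigQuery costs.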

Snowflake

As a cloud-only, fully-managed solution, Snowflake has a clear separation between compute and storage. For Snowflake on Azure, which is what we used for the queries, data is stored in Azure Storage and is cached when queries are executed to bring the data into closer proximity to compute resources. Snowflake essentially offers two configuration “levers”—the size of the warehouse cluster and how many clusters are permitted to spin up to handle concurrency. Snowflake scales by cluster server count in powers of two (1, 2, 4, 8, 16, and so on). If multi-cluster scaling is enabled, Snowflake spins up additional clusters to handle multi-user concurrent query workloads and automatically spins them down once demand has passed. If it is not enabled, Snowflake places queries in a queue until resources free up.
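
A minimal sketch of these two levers in Snowflake DDL follows (the warehouse name is hypothetical; the parameter names are Snowflake's own):

    create warehouse if not exists logs_wh
      warehouse_size = 'XLARGE'    -- the size lever; X-Large matches our test configuration
      min_cluster_count = 1
      max_cluster_count = 1        -- the concurrency lever; values > 1 enable multi-cluster scale-out
      auto_suspend = 300           -- suspend after 5 idle minutes so billing stops
      auto_resume = true;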

For Snowflake, you pay a flat hourly fee while compute resources are in use. We paid $3.00 per credit per hour for the Enterprise tier; an X-Large warehouse consumes 16 credits per hour, consistent with the $48 per hour shown in Table 1. Once the compute warehouse goes inactive, you no longer pay. However, there is a separate charge for data storage.

Elasticsearch/OpenSearch

Elasticsearch is a distributed, open search and analytics engine for many types of data, including textual, numerical, geospatial, structured, and unstructured. Elasticsearch was built on the Apache Lucene project and was first released in 2010 by Elastic. Elasticsearch is the “E” component of the ELK Stack, a set of tools for data ingestion, enrichment, storage, analysis, and visualization. (The “L” and “K” of the ELK Stack are Logstash and Kibana, respectively.) Today, the Elastic Stack also includes a rich collection of lightweight shipping agents known as Beats for sending data to Elasticsearch. AWS OpenSearch is Amazon's fully-managed service built on OpenSearch, the open-source fork of Elasticsearch. In addition to the Elasticsearch features above, OpenSearch can also be loaded with Amazon Kinesis Firehose, which we attempted for this test.

We wanted to run our field test on a fully-managed Elasticsearch service because this workload seems like a good fit for the platform. We investigated Elastic Cloud and AWS OpenSearch but encountered difficulties that made running the test infeasible. We will fully disclose the reasons for this in the Results section.

3. Field Test Setup

GigaOm Log Analytics Field Test

The GigaOm Log Analytics Field Test is a workload created to simulate the activities found in complex, machine-generated log environments. The data schema, population, and implementation have been designed to be broadly representative of modern use cases. In this case, the implementation is the instrumented activity of a cloud service processing input files from different sources using different components. The testing exercises typical analytical needs over this dataset. The data generator and set of queries used for this test are located in this repository.

Test Environments

Selecting and sizing the compute and storage for comparison can be challenging, particularly across different cloud vendors’ offerings.

Results may vary across different configurations, and again, you are encouraged to compile your own representative queries, data sets, data sizes, and test-compatible configurations applicable to your requirements. All told, our testing included three different database environments, as shown in Table 1.

Table 1. Configurations Used for Tests *

Platform | ADX | BigQuery | Snowflake
Cloud | Azure | Google | Azure
Version/Release | 1.05230.12040 | May 23, 2022 | 6.17.0
Region | East US 2 | US | East US 2
Node Count | 9 | 2,000 | 16
Instance Type | L16as_v3 | Regular Slots | X-Large
Cost Per Hour | $32 | $55 | $48
Source: GigaOm 2022

* Note: The native JSON type preview in BigQuery was released after our testing.

When conducting field tests like this, we attempt to balance the costs, hardware, and software as much as possible. However, in fully-managed cloud services with very different architectures, this is difficult.

Field Test Data

The testing simulates a workload of machine-generated log files. The log files are comma-separated values generated using a custom application built for this test. The data generator was written in C#. It can be used to generate data sizes of 1GB and 1TB for quality assurance purposes and 100TB for the actual production test. The generator also uploads the data to an Azure Storage Blob. The program generates random, human-readable data typically found in system logs. The schema of the data set and the respective data types used for ADX, BigQuery (BQ), and Snowflake (SNOW) are as shown in Table 2.

Table 2. Field Test Dataset Schema

Field Name | Brief Description | ADX | BigQuery | Snowflake
Timestamp | The date and time the log event occurred | Datetime | Timestamp | Timestamp
Source | What triggered the log event | String | String | String
Node | The name of the node where the log event originated | String | String | String
Level | The level of severity of the event | String | String | String
Component | The application that threw the log message | String | String | String
ClientRequestId | The identifier making the request that triggered the event | String | String | String
Message | The full text of the log message | String | String | String
Properties | JSON-encoded readings of certain system components (Optional) | Dynamic | Struct | Object
Source: GigaOm 2022

The total uncompressed dataset is 100TB; compressed with Gzip, it is roughly 12TB on disk. The data is time-based and covers a span of nearly three months. The total record count is over 404 billion rows. The dataset was designed to simulate a very high volume of log activity, with approximately 250 to 300 million rows for every hour in our Timestamp field.

Loading the Dataset

Loading the data into the respective cluster environments was straightforward for ADX, BigQuery, and Snowflake; we describe the process for each platform below. Note that our 100TB dataset was spread equally across ten Azure Blob Storage containers, which permitted some parallelization of the loading (and allowed us to circumvent our Azure Blob Storage per-container storage quotas).

ADX
At the recommendation of Microsoft, we used a program called LightIngest to load ADX. LightIngest is a freely-available Windows application. We used it to ingest the data directly from our Azure Blob Storage containers. For ADX, we used the default configuration, so there was no need to specify partitioning, clustering, or any other configuration option.
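
For reference, a representative LightIngest invocation looks like the following (the cluster, database, container, and SAS token values are placeholders; see the LightIngest documentation for the full argument list):

    LightIngest "https://ingest-ourcluster.eastus2.kusto.windows.net;Fed=True" -db:logsdb -table:Logs -source:"https://ourstorageaccount.blob.core.windows.net/logs-container;<SAS token>" -pattern:"*.csv.gz" -format:csv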

It took LightIngest approximately four hours to load the entire dataset.

BigQuery
To load BigQuery, we first transferred the data from our Blob Storage to Google Cloud Storage buckets using Google Cloud Storage Transfer service. This process took a couple of hours. Once the data was transferred and verified, we used the GCloud BigQuery command line utility to load the data into BigQuery. Here is a sample of the command we used to load BigQuery:

    bq --project_id=our_project_id --location=US load --field_delimiter "," --source_format=CSV --allow_quoted_newlines --time_partitioning_field Timestamp --time_partitioning_type HOUR --clustering_fields ClientRequestId,Source,Level,Component logbenchmark100tb.logs gs://our-bucket/* logs.json

There are a number of configuration details to note in this statement. First, we followed Google’s documentation on partitioning and partitioned the dataset on the Timestamp column, creating a new partition for every hour. Then, we followed Google’s documentation on clustering and clustered the data within those partitions using Client Request ID, Source, Level, and Component (in that order). The JSON config file (logs.json) at the end of the statement delineated the schema of the data (those specifics are detailed above in Table 2).

BigQuery does not permit CSV data to be loaded directly into the Struct column of Properties. Thus, the Properties data was loaded in raw form into a String column. Rather than adding a Struct column and running an UPDATE to parse the raw JSON with the JSON_EXTRACT_SCALAR() function into it, we kept the raw column and modified the test queries to use JSON query functions over it. This turned out to have performance similar to a Struct column and to be semantically more correct, since the dynamic column in ADX is both dynamic and schemaless, while a Struct is schematized JSON that requires an a priori definition of the column.
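
A minimal sketch of this pattern follows (the table path is hypothetical; JSON_EXTRACT_SCALAR() returns a STRING, so numeric fields must be cast):

    -- Query the raw JSON String column directly instead of a pre-parsed Struct:
    select
      Source,
      sum(cast(json_extract_scalar(Properties, '$.rowCount') as int64)) as TotalRows
    from
      our_project.logs100tb.logs
    group by
      Source;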

All told, it took the GCloud BigQuery command-line utility about ten hours to load, partition, and cluster the data.

Snowflake
To load Snowflake, we created a table clustered by Timestamp, Level, Source, Node, and Component. We used the COPY INTO command to load the data directly from Azure Blob Storage. We also parsed the JSON in the Properties column into a Snowflake OBJECT. Finally, we added Search Optimization to the table and waited for the search table to be completely built in the background. It took Snowflake about 22 hours to load and cluster the 100TB dataset.
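
A minimal sketch of this load follows, with placeholder account, container, and SAS token values:

    -- Load the Gzip-compressed CSVs directly from Azure Blob Storage:
    copy into logs_c
    from 'azure://ourstorageaccount.blob.core.windows.net/logs-container/'
    credentials = (azure_sas_token = '<SAS token>')
    file_format = (type = csv field_delimiter = ',' compression = gzip);

    -- Then add Search Optimization and let it build in the background:
    alter table logs_c add search optimization;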

Field Test Execution

We executed the log analytics field test using a custom-written Python script that connected to each platform, executed the test queries, and measured the elapsed time. The script is capable of both single-user and concurrent-user tests.

For the single-user tests, each query was executed serially 20 times in a row, and the median and 95th percentile values of query execution times are reported in the next section.

For the concurrent-user tests, 50 user threads were created and each submitted the same query simultaneously. The Python driver waited for all 50 users’ queries to complete before moving on to the next query. If the two-minute timeout was reached, the query (or queries) were canceled. The median and 95th percentile values of query execution times are reported in the next section.
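
The following sketch captures the shape of that driver for the concurrent-user case (run_query is a hypothetical stand-in for each platform's client library call):

    # Sketch of the concurrent-user driver; run_query is a hypothetical
    # stand-in for each platform's client library call.
    import time
    from concurrent.futures import ThreadPoolExecutor
    from statistics import median

    TIMEOUT_SECONDS = 120
    CONCURRENT_USERS = 50

    def timed_run(run_query, sql):
        # Time one execution; canceled queries score the 120-second ceiling.
        start = time.monotonic()
        try:
            run_query(sql, timeout=TIMEOUT_SECONDS)
        except TimeoutError:
            return float(TIMEOUT_SECONDS)
        return min(time.monotonic() - start, TIMEOUT_SECONDS)

    def concurrent_test(run_query, sql):
        # All 50 users submit the same query simultaneously; wait for every
        # result before moving on to the next query.
        with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
            futures = [pool.submit(timed_run, run_query, sql)
                       for _ in range(CONCURRENT_USERS)]
            timings = sorted(f.result() for f in futures)
        # Simple nearest-rank approximation of the 95th percentile.
        p95_index = max(0, round(0.95 * len(timings)) - 1)
        return median(timings), timings[p95_index]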

4. Field Test Results

This section analyzes the results of our log benchmark and the queries we executed to return these results. First, a disclaimer: We made a best effort to translate the queries between Azure Data Explorer’s Kusto Query Language (KQL), Google BigQuery’s Standard SQL, and SnowSQL.

Table 3 shows the average execution time with 50 concurrent users over a single run. Also, note that the values in this table are based on all query executions, including timeouts (marked with an asterisk). In cases with extended timeouts, we set the execution time ceiling to 120 seconds. The query syntax can be found in the appendix.

Table 3. 50 Concurrent Users Average Execution Time (in Seconds) Over a Single Run

Query | Azure Data Explorer | Google BigQuery | Snowflake
1 | 5.78 | 22.81 | 50.84
2 | 7.53 | 120.00 * | 120.00 *
3 | 0.38 | 14.68 | 111.40 *
4 | 0.20 | 46.55 | 86.11 *
5 | 3.57 | 13.54 | 27.24
6 | 2.90 | 120.00 * | 120.00 *
7 | 1.07 | 7.31 | 8.38
8 | 2.12 | 6.04 | 11.88
9 | 2.60 | 8.75 | 15.37
10 | 3.65 | 120.00 * | 120.00 *
11 | 8.57 | 35.40 | 37.21
12 | 2.51 | 5.53 | 14.41
13 | 0.56 | 37.10 | 23.97
14 | 1.53 | 1.44 | 5.19
15 | 2.27 | 120.00 * | 120.00 *
16 | 1.27 | 120.00 * | 22.60
17 | 2.54 | 120.00 * | 119.63 *
18 | 2.07 | 120.00 * | 115.76 *
19 | 3.38 | 120.00 * | 108.50 *
Sum of Averages | 54.52 | 1,159.15 | 1,238.47
Source: GigaOm 2022

Figure 1 shows the total time on task for all queries, based on the average for each query.


Figure 1. Sum of Average Query Runs, All 19 Queries, 50 Concurrent Users

In Table 4 we see the results of our query tests with a single user, with the execution time averaged over three consecutive runs. Again, the values in this table are based on all query executions, including timeouts (marked with an asterisk). In cases with extended timeouts, we set the execution time ceiling to 120 seconds. The query syntax can be found in the appendix.

Table 4. Single User Average Execution Time (in Seconds) Over 3 Consecutive Runs

Query | Azure Data Explorer | Google BigQuery | Snowflake
1 | 0.23 | 2.06 | 6.20
2 | 0.38 | 40.78 | 115.63 *
3 | 0.06 | 1.75 | 9.55
4 | 0.05 | 5.66 | 4.11
5 | 0.13 | 2.22 | 4.94
6 | 0.14 | 9.35 | 35.53
7 | 0.10 | 1.83 | 1.83
8 | 0.13 | 0.71 | 1.69
9 | 0.14 | 0.92 | 3.64
10 | 0.20 | 9.02 | 19.81
11 | 0.45 | 1.85 | 3.22
12 | 0.18 | 1.11 | 1.37
13 | 0.11 | 4.44 | 2.32
14 | 0.17 | 0.60 | 1.15
15 | 0.15 | 20.31 | 37.56
16 | 0.13 | 120.00 * | 7.11
17 | 0.13 | 28.65 | 11.69
18 | 0.12 | 32.15 | 11.09
19 | 0.16 | 25.62 | 8.80
Sum of Averages | 3.15 | 309.01 | 287.24
Source: GigaOm 2022

Finally, Figure 2 shows the total time on task for all single-user queries, based on the average for each query.


Figure 2. Sum of Average Query Runs, All 19 Queries, 1 User

Issues with Elasticsearch/OpenSearch

The architecture of Elasticsearch made loading 100TB of data in a hot tier very expensive. Elastic Cloud has a number of storage tiers (hot, warm, cold, and frozen), and AWS OpenSearch has instance storage (locally attached SSDs or Elastic Block Store volumes), UltraWarm (separate instances), and cold storage (S3). According to AWS, the minimum storage requirement for OpenSearch is:

Minimum Storage = Source Data * (1 + Number of Replicas) * 1.45

This means that even a single copy of the data with no replicas (not recommended for disaster recovery) would require a minimum of 100TB x (1 + 0) x 1.45 = 145TB of storage for our tests. ADX, BigQuery, and Snowflake all compress data when loading. Elasticsearch actually expands the storage footprint.

Unfortunately, Elastic Cloud does not allow compute and hot storage to be scaled independently because the hot tier depends on locally-attached SSDs, which are physically attached to the cloud instance. AWS OpenSearch can scale Elastic Block Store (EBS) volumes independently, but EBS is not as performant as locally-attached SSDs, which cannot be scaled. Only locally-attached SSDs provide the performance and I/O throughput we needed for our tests.

For Elasticsearch, to achieve the minimum storage requirement of 145TB in the hot tier, we would need an Azure (Virginia) storage-optimized deployment across three availability zones with a total of 147TB, 4TB of RAM, and 575 vCPU. This deployment costs $170 per hour to run. For AWS OpenSearch, we would need 40 nodes of r6gd.16xlarge.search (each instance has 3.8TB of local SSDs) to achieve 145TB of hot tier storage, costing $244 per hour. We had the option of a smaller cluster (35TB over nine nodes) to keep the most queried data on local SSD storage and put the rest on EBS (110TB). This would bring the compute cost down to $55 per hour, but the EBS storage would cost approximately $21 per hour.

On the surface, this may seem feasible for our testing, which has a run time of no more than 38 minutes per run (19 queries x 2-minute timeout). However, the loading time on Elasticsearch makes the costs very high. Elasticsearch does not have a high-speed, bulk data loading mechanism like ADX, BigQuery, and Snowflake. Loading and clustering the 100TB dataset on BigQuery and Snowflake took less than a day. On Elastic Cloud, we used 80 instances of Logstash (which have a non-trivial hourly cost), and we could only achieve an indexing rate of 500GB per hour, so loading 100TB would have taken approximately eight days (100TB / 0.5TB per hour = 200 hours). This drove the costs of our test to well over $30,000.

For AWS OpenSearch, we attempted to use Kinesis Firehose, which AWS recommends for loading large amounts of data. This achieved a faster indexing rate than Logstash, but the costs of running multiple Kinesis agents plus the cost of our OpenSearch cluster brought the total to nearly $10,000. The costs of loading the data and running these tests on Snowflake and BigQuery were well under $5,000. The costs of loading the data and running these tests on ADX were less than $500.

5. Conclusion

This report outlines the results from a GigaOm Log Analytics Field Test comparing Azure Data Explorer (ADX), Google BigQuery, and Snowflake. Overall, the results were insightful in revealing the query execution performance of the three platforms tested. Some of the test highlights include:

  • Azure Data Explorer (ADX) outperformed Google BigQuery and Snowflake on all 19 test queries with a single user and 18 of 19 with 50 concurrent users.
  • ADX completed all 19 queries in under one second with a single user, while the average execution time on BigQuery and Snowflake was 15 and 13 seconds per query, respectively.
  • ADX completed all 19 queries in under 10 seconds with 50 concurrent users.
  • BigQuery and Snowflake both had eight queries (different ones) that did not complete within the two-minute timeout.

Based on our testing and cost information, we found that Azure Data Explorer produced a roughly 97% price-performance advantage over both Snowflake and Google BigQuery. In our tests with 50 concurrent users, Azure Data Explorer produced a $0.48 per hour run rate, compared to BigQuery and Snowflake, which produced rates of $17.71 and $16.51 per hour, respectively; ADX's rate is about 3% of either competitor's.

Keep in mind that the tests were configured to get the best from each platform according to publicly documented best practices. Further optimizations may become possible as these offerings evolve or as internal tests point to different configurations. Beyond raw performance, we gathered some interesting takeaways from our study:

  • The Azure Data Explorer query language, KQL, was by far the simplest and easiest syntax to write. It has thoughtful and useful functions for an analyst (like arg_max()) that are not found, or not easily written, in SQL.
  • BigQuery was an efficient data loader and a decent performer, but it would be an expensive solution for log analytics where a slot commitment would be required—otherwise, the on-demand rate of $5/TB of scanned data would be outrageously expensive for queries.
  • Snowflake was easy to use. Both loading the data and running the queries were simple, straightforward processes. However, during the 50 concurrent-user tests, it would only run 5-8 queries at a time and queue the rest. We even set the Maximum Concurrent Queries system parameter to its maximum allowed value of 32, and this did not change Snowflake’s self-protective behavior. To account for this, we could have enabled up to 10 multi-clusters to handle the concurrent workload. However, the test costs would have been up to 10x as well, and doing so would have given Snowflake an unfair resource advantage over the other platforms.

All in all, we found Azure Data Explorer (Kusto) to be a great performer with the lowest cost, and an easy-to-use, feature-rich solution for log analytics.

6. Disclaimer

Performance is important, but it is only one criterion for a business-critical database platform selection. This test is a point-in-time check into specific performance. There are numerous other criteria to consider in selection, across factors such as Administration, Integration, Workload Management, User Interface, Scalability, Vendor, and Reliability. In our experience, performance changes over time and is competitively different for different workloads. Also, a performance leader can hit up against the point of diminishing returns, and viable contenders can quickly close the gap.

GigaOm runs all of its performance tests to strict ethical standards. The report results are the objective findings of the application of queries to the simulations described. The report clearly defines the selected criteria and process used to establish the field test. It also clearly states the data set sizes, the platforms, the queries, etc., that were used. The reader is left to determine how to qualify the information for their individual needs. The report does not make any claim regarding third-party certification. It presents the objective results received from applying the process to the criteria as described in the report. The report strictly measures performance and does not purport to evaluate other factors that potential customers may find relevant when making a purchase decision.

This is a sponsored report. Microsoft chose the competitors, the test, and the Azure Data Explorer configuration. GigaOm chose the most compatible configurations for the other tested platforms and ran the testing workloads. Choosing compatible configurations is subject to judgment. We have attempted to describe our decisions in this paper.

7. Appendix

Table 5 provides a description of the queries used for the test for each of the three platforms. For those wishing to replicate the tests, the query syntax for each follows the table.

Table 5. Summary of Queries

Query | Description
1 | Performs a simple count of terms in a case-insensitive search
2 | Performs a simple count of a rarely found term in a case-insensitive search
3 | Performs an advanced text search using both “contains” and “starts with” qualifiers over a short time period in a case-insensitive search
4 | Performs a full text search across multiple columns and fetches a sample
5 | Finds session logs from a list of Client Request IDs
6 | Performs an aggregation by a calculated column based on a multi-value lookup in a case-sensitive search
7 | Extracts exception type and message from error traces
8 | Aggregates by a single low-cardinality dimension
9 | Performs an aggregation by two dimensions and gets the top 50 results
10 | Performs an aggregation by multiple dimensions, including the high-cardinality column Client Request ID
11 | Performs an aggregation over data resulting from a sub-query, resolves the nodes that produce the most errors, and drills into their log distribution per level
12 | Performs a distinct count of sessions that had warning traces over a time period
13 | Performs an aggregation of dynamic properties and provides statistics of ingestion per source
14 | Performs filtering and aggregation, and finds top values based on a dynamic column of semi-structured data
15 | Finds top activities (by CPU consumption) and then analyzes node performance over buckets of 5 minutes
16 | Performs a top-nested drill-down into errors with three levels: Source (all), Nodes (top 3), and Components (top 3 + others)
17 | Finds the 10 tables with the highest count of ingested rows for a specific Source
18 | Performs an aggregation based on a dynamic property, and then drills down into the top 10 tables by row count
19 | Performs a join on a calculated key and checks whether any file was downloaded and completed by two different activities (i.e., looks for a bug)
Source: GigaOm 2022

Query 1 – Count the appearance of a common term

This query performs a simple count of terms in a case-insensitive search.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08) .. 12h)
    | where Level == 'Warning' and Message has 'enabled'
    | count
BigQuery
    select
      count(*)
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08'
        and timestamp_add(timestamp '2014-03-08', interval 12 hour)
      and Level = 'Warning'
      and regexp_contains(Message, r"(?i)(\b|_)enabled(\b|_)")
Snowflake
    select
      count(*)
    from
      logs_c
    where
      Timestamp between
        timestamp '2014-03-08'
        and timestampadd(hour, 12, to_date('2014-03-08'))
      and Collate(Level, '') = 'Warning'
      and regexp_like(Message, '.*enabled.*', 'is');

Query 2 – Count the appearance of a rare term

This query performs a simple count of a rarely found term in a case-insensitive search.

ADX
    Logs
    | where Level == 'Error'
    | where Message has 'safeArrayRankMismatch'
    | count
BigQuery
    select
      count(*)
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Level = 'Error'
      and regexp_contains(Message, r"(?i)(\b|_)safeArrayRankMismatch(\b|_)");
Snowflake
    select
      count(*)
    from
      logs_c
    where
      Level = 'Error'
      and regexp_like(Message, '.*SafeArrayRankMismatch.*', 'is');

Query 3 – Advanced text search using “contains” and “starts with”

This query performs an advanced text search using both “contains” and “starts with” qualifiers over a short time period in a case-insensitive search.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08 03:00:00) .. 1h)
    | where Source startswith 'im' and Message contains 'PONSE'
    | summarize Count=count() by Component
BigQuery
    select
      Component,
      count(*)
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08 03:00:00'
        and timestamp_add(timestamp '2014-03-08 03:00:00', interval 1 hour)
      and starts_with(lower(Source), 'im')
      and lower(Message) like '%ponse%'
    group by
      Component
    ;
Snowflake
    select
      Component,
      count(*)
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 03:00:00')
        and timestampadd(hour, 1, to_timestamp('2014-03-08 03:00:00'))
      and startswith(Source, collate('im', 'en-ci'))
      and contains(lower(Message), 'ponse')
    group by
      Component
    order by
      Component;

Query 4 – Full text search across multiple columns and fetch a sample

This query performs a full text search across multiple columns and fetches a sample.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08 03:00:00) .. 1h)
    | where * has 'Internal'
    | top 1000 by Timestamp
BigQuery
    select
      *
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08 03:00:00'
        and timestamp_add(timestamp '2014-03-08 03:00:00', interval 1 hour)
      and (
        regexp_contains(Source, r"(?i)(\b|_)Internal(\b|_)")
        or regexp_contains(Node, r"(?i)(\b|_)Internal(\b|_)")
        or regexp_contains(Level, r"(?i)(\b|_)Internal(\b|_)")
        or regexp_contains(Component, r"(?i)(\b|_)Internal(\b|_)")
        or regexp_contains(ClientRequestId, r"(?i)(\b|_)Internal(\b|_)")
        or regexp_contains(Message, r"(?i)(\b|_)Internal(\b|_)")
      )
    order by
      Timestamp
    limit 1000;
Snowflake
    select
      *
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 03:00:00')
        and timestampadd(hour, 1,to_timestamp('2014-03-08 03:00:00'))
      and (
        regexp_like(Source,'.*\\bInternal\\b.*', 'is')
        or regexp_like(Node,'.*\\bInternal\\b.*', 'is')
        or regexp_like(Level,'.*\\bInternal\\b.*', 'is')
        or regexp_like(Component,'.*\\bInternal\\b.*', 'is')
        or regexp_like(ClientRequestId,'.*\\bInternal\\b.*', 'is')
        or regexp_like(Message,'.*\\bInternal\\b.*', 'is')
      )
    order by
      Timestamp
    limit 1000;

Query 5 – Find session logs by Client Request ID

This query finds session logs from a list of Client Request IDs.

ADX
    Logs
    | where ClientRequestId in (
    'd71ab629-ebaf-9380-5fe8-942541387ce5',
    '6bb29a30-ce0d-1288-36f0-27dbd57d66b0',
    '1f82e290-a7c4-ac84-7117-52209b3b9c91',
    'ecc12181-8c5a-4f87-1ca3-712b4a82c4f0',
    'd275a6f0-ba1d-22cf-b06b-6dac508ece4b',
    'f0565381-29db-bf73-ca1b-319e80debe1c',
    '54807a9a-e442-883f-6d8b-186c1c2a1041',
    'f1d10647-fc31-dbc3-9e25-67f68a6fe194')
    | count
BigQuery
    select
      count(*)
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      ClientRequestId in (
        'd71ab629-ebaf-9380-5fe8-942541387ce5',
        '6bb29a30-ce0d-1288-36f0-27dbd57d66b0',
        '1f82e290-a7c4-ac84-7117-52209b3b9c91',
        'ecc12181-8c5a-4f87-1ca3-712b4a82c4f0',
        'd275a6f0-ba1d-22cf-b06b-6dac508ece4b',
        'f0565381-29db-bf73-ca1b-319e80debe1c',
        '54807a9a-e442-883f-6d8b-186c1c2a1041',
        'f1d10647-fc31-dbc3-9e25-67f68a6fe194'
      )
Snowflake
    select
      count(*)
    from
      logs_c
    where
      ClientRequestId in (
        'd71ab629-ebaf-9380-5fe8-942541387ce5',
        '6bb29a30-ce0d-1288-36f0-27dbd57d66b0',
        '1f82e290-a7c4-ac84-7117-52209b3b9c91',
        'ecc12181-8c5a-4f87-1ca3-712b4a82c4f0',
        'd275a6f0-ba1d-22cf-b06b-6dac508ece4b',
        'f0565381-29db-bf73-ca1b-319e80debe1c',
        '54807a9a-e442-883f-6d8b-186c1c2a1041',
        'f1d10647-fc31-dbc3-9e25-67f68a6fe194'
      );

Query 6 – Aggregate a calculated column based on a multi-value lookup

This query performs an aggregation by a calculated column based on a multi-value lookup in a case-sensitive search.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08) .. 10d)
    | where Source in ('IMAGINEFIRST0', 'HAVINGCOLUMN182', 'THEREFORESTORE156', 'HOSTNODES207')
    | extend LogType = case(Component in ('CLOUDREPORTSERVER', 'COMMON1', 'FABRICINTEGRATOR', 'REQUESTPROTECTION', 'DIRECTORYSERVICE', 'REPORTSERVERSERVICETRACE', 'ACONFIGURATION', 'EXPLORESERVICEWATCHDOG', 'COMMUNICATIONRUNTIME'), 'Security',
                            Component in ('REPORTNETWORKING', 'PUSHDATASERVICETRACE', 'HEALTHSERVICE', 'UTILS', 'PROVIDERSCOMMON'), 'Performance',
                            Component in ('WORKERSERVICECONTENT', 'XMLACOMMON', 'INTEGRATIONDATABASE', 'DATABASEMANAGEMENT'), 'Ingestion',
                            'Other')
    | summarize Count=count() by LogType
BigQuery
    select
      case
        when Component in ('CLOUDREPORTSERVER', 'COMMON1', 'FABRICINTEGRATOR', 'REQUESTPROTECTION', 'DIRECTORYSERVICE', 'REPORTSERVERSERVICETRACE', 'ACONFIGURATION', 'EXPLORESERVICEWATCHDOG', 'COMMUNICATIONRUNTIME') then 'Security'
        when Component in ('REPORTNETWORKING', 'PUSHDATASERVICETRACE', 'HEALTHSERVICE', 'UTILS', 'PROVIDERSCOMMON') then 'Performance'
        when Component in ('WORKERSERVICECONTENT', 'XMLACOMMON', 'INTEGRATIONDATABASE', 'DATABASEMANAGEMENT') then 'Ingestion'
        else 'Other'
      end as LogType,
      count(*)
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08'
        and timestamp_add(timestamp '2014-03-08', interval 10 day)
      and Source in ('IMAGINEFIRST0', 'HAVINGCOLUMN182', 'THEREFORESTORE156', 'HOSTNODES207')
    group by
      LogType;
Snowflake
    select
      case
        when Component in ('CLOUDREPORTSERVER', 'COMMON1', 'FABRICINTEGRATOR', 'REQUESTPROTECTION', 'DIRECTORYSERVICE', 'REPORTSERVERSERVICETRACE', 'ACONFIGURATION', 'EXPLORESERVICEWATCHDOG', 'COMMUNICATIONRUNTIME') then 'Security'
        when Component in ('REPORTNETWORKING', 'PUSHDATASERVICETRACE', 'HEALTHSERVICE', 'UTILS', 'PROVIDERSCOMMON') then 'Performance'
        when Component in ('WORKERSERVICECONTENT', 'XMLACOMMON', 'INTEGRATIONDATABASE', 'DATABASEMANAGEMENT') then 'Ingestion'
        else 'Other'
      end as LogType,
      count(*)
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08')
        and timestampadd(day, 10, to_timestamp('2014-03-08'))
      and Source in ('IMAGINEFIRST0', 'HAVINGCOLUMN182', 'THEREFORESTORE156', 'HOSTNODES207')
    group by
      LogType;

Query 7 – Extract exception types and messages from error traces

This query extracts Exception type and Message from error traces.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08 12:00) .. 1h)
    | where Level=='Error' and  Message startswith 'exception'
    | parse Message with 'Exception=' ExceptionType ';' * 'Message=' ExceptionMessage ';' *
    | summarize Count=count() by ExceptionType, ExceptionMessage
    | top 10 by Count
BigQuery
    select
      regexp_extract(Message, 'Exception=(.*);') as ExceptionType,
      regexp_extract(Message, 'Message=(.*);') as ExceptionMessage,
      count(*) as ExceptionCount
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between timestamp '2014-03-08 12:00:00' and timestamp_add(timestamp '2014-03-08 12:00:00', interval 1 hour)
      and Level = 'Error'
      and starts_with(lower(Message), 'exception')
    group by
      ExceptionType,
      ExceptionMessage
    order by
      ExceptionCount desc
    limit 10;
Snowflake
    select
      regexp_substr(Message, 'Exception=(.*);', 1, 1, 'e', 1) as ExceptionType,
      regexp_substr(Message, 'Message=(.*);', 1, 1, 'e', 1) as ExceptionMessage,
      count(*) as ExceptionCount
    from
      logs_c
    where
      Timestamp between to_timestamp('2014-03-08 12:00:00') and timestampadd(hour, 1, to_timestamp('2014-03-08 12:00:00'))
      and Level = 'Error'
      and startswith(Message, collate('exception', 'en-ci'))
    group by
      ExceptionType,
      ExceptionMessage
    order by
      ExceptionCount desc
    limit 10;

Query 8 – Aggregate by a single low cardinality dimension

This query aggregates by a single low cardinality dimension.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08 12:00) .. 6h)
    | summarize Count=count() by Level
BigQuery
    select
      Level,
      count(*) as Count
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 6 hour)
    group by
      Level;
Snowflake
    select
      Collate(Level, ''),
      count(*) as Count
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(hour, 6, timestamp '2014-03-08 12:00:00')
    group by
      Collate(Level, '');

Query 9 – Aggregate by two dimensions and get top results

This query performs an aggregation by two dimensions and gets the top 50 results.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08 12:00) .. 6h)
    | summarize Count=count() by Level, Component
    | top 50 by Count
BigQuery
    select
      Level,
      Component,
      count(*) as Count
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 6 hour)
    group by
      Level,
      Component
    order by
      Count desc
    limit 50;
Snowflake
    select
      Level,
      Component,
      count(*) as Count
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(hour, 6, timestamp '2014-03-08 12:00:00')
    group by
      Level,
      Component
    order by
      Count desc
    limit 50;

Query 10 – Aggregate by multiple dimensions including a high cardinality column

This query performs an aggregation by multiple dimensions, including the high cardinality column Client Request ID.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08) .. 3d)
    | where Source == 'IMAGINEFIRST0' and Message has 'downloaded'
    | summarize hint.shufflekey=ClientRequestId Count=count() by Component, Level, Node, ClientRequestId
    | top 10 by Count
BigQuery
    select
      Component,
      Level,
      Node,
      ClientRequestId,
      count(*) as Count
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08 00:00:00'
        and timestamp_add(timestamp '2014-03-08 00:00:00', interval 3 day)
      and Source = 'IMAGINEFIRST0'
      and regexp_contains(Message, r"(?i)(\b|_)downloaded(\b|_)")
    group by
      Component,
      Level,
      Node,
      ClientRequestId
    order by
      Count desc
    limit 10;
Snowflake
    select
      Component,
      Level,
      Node,
      ClientRequestId,
      count(*) as Count
    from
      logs_c
    where
      Timestamp between
            to_timestamp('2014-03-08 00:00:00')
        and timestampadd(day, 3, timestamp '2014-03-08 12:00:00')
      and Source = 'IMAGINEFIRST0'
      and regexp_like(Message,'.*\\bDownloaded\\b.*', 'is')
    group by
      Component,
      Level,
      Node,
      ClientRequestId
    order by
      Count desc
    limit 10;

Query 11 – Resolve nodes that produce most of errors and drill down

This query performs an aggregation over data resulting from a sub-query, resolves the nodes that produce the most errors, and drills into their log distribution per level.

ADX
    let top_nodes =
        Logs
        | where Timestamp between(datetime(2014-03-08 12:00) .. 6h)
        | where Level == 'Error'
        | summarize count() by Node
        | top 10 by count_
        | project Node;
    Logs
    | where Timestamp between(datetime(2014-03-08 12:00) .. 6h)
    | where Node in (top_nodes)
    | summarize count() by Level, Node
BigQuery
    with top_nodes as (
    select
      Node
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 6 hour)
      and Level = 'Error'
    group by
      Node
    order by
      count(*) desc
    limit 10
    )
    select
      Level,
      Node,
      count(*) as Count
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 6 hour)
      and Node in (select Node from top_nodes)
    group by
      Level,
      Node;
Snowflake
    with top_nodes as (
    select
      Node
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(hour, 6, timestamp '2014-03-08 12:00:00')
      and Level = 'Error'
    group by
      Node
    order by
      count(*) desc
    limit 10
    )
    select
      Level,
      Node,
      count(*) as Count
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(hour, 6, timestamp '2014-03-08 12:00:00')
      and Node in (select Node from top_nodes)
    group by
      Level,
      Node;

Query 12 – Distinct count of sessions that had warning traces over a short period of time

This query performs a distinct count of sessions that had warning traces over a time period.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08 12:00) .. 6h)
    | where Level == 'Error'
    | summarize dcount(ClientRequestId) by bin(Timestamp, 1h)
BigQuery
    select
      timestamp_seconds(1*3600 * div(unix_seconds(Timestamp), 1*3600)) as bin,
      count(distinct ClientRequestId) as dcount
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 6 hour)
      and Level = 'Error'
    group by
      bin
    order by
      bin;
Snowflake
    select
      time_slice(Timestamp, 1, 'HOUR') as bin,
      count(distinct ClientRequestId) as dcount
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(hour, 6, timestamp '2014-03-08 12:00:00')
      and Level = 'Error'
    group by
      bin
    order by
      bin;

Query 13 – Aggregate statistics of ingestion per source

This query performs an aggregation of dynamic properties and provides statistics of ingestion per source.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08 12:00) .. 6h)
    | where Message startswith 'IngestionCompletionEvent'
    | where Source in ('IMAGINEFIRST0', 'CLIMBSTEADY83', 'INTERNALFIRST79', 'WORKWITHIN77', 'ADOPTIONCUSTOMERS81', 'FIVENEARLY85',
                        'WHATABOUT98', 'PUBLICBRAINCHILD89', 'WATCHPREVIEW91', 'LATERYEARS87', 'GUTHRIESSCOTT93', 'THISSTORING16')
    | where Properties has 'parquet'
    | summarize MaxRowCount= max(tolong(Properties.rowCount)), percentiles(totimespan(Properties.duration), 50, 90, 95) by Source
BigQuery
    with PercentilesTable as (
    select
      Source,
      max(PropertiesStruct.rowCount) as MaxRowCount,
      approx_quantiles(time_diff(PropertiesStruct.duration, '00:00:00', microsecond) / 1000000, 100) percentiles
    from
      gigaom-microsoftadx-2022.logs100tb.logs_c
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 6 hour)
      and starts_with(Message, 'IngestionCompletionEvent')
      and Source in ('IMAGINEFIRST0', 'CLIMBSTEADY83', 'INTERNALFIRST79', 'WORKWITHIN77', 'ADOPTIONCUSTOMERS81', 'FIVENEARLY85', 'WHATABOUT98', 'PUBLICBRAINCHILD89', 'WATCHPREVIEW91', 'LATERYEARS87', 'GUTHRIESSCOTT93', 'THISSTORING16')
      and regexp_contains(Properties, r"(?i)(\b|_)parquet(\b|_)")
    group by
      Source
    )
    select
      Source,
      MaxRowCount,
      percentiles[offset(50)] as p50,
      percentiles[offset(90)] as p90,
      percentiles[offset(95)] as p95
    from
      PercentilesTable
    order by
      MaxRowCount desc;
Snowflake
    select
      Source,
      max(cast(PropertiesStruct:rowCount as int)) as MaxRowCount,
      max(case when startswith(PropertiesStruct:duration, '1.')
      then timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:duration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000 + 24*3600
      else timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:duration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000
      end),
      percentile_cont(0.50) within group (order by case when startswith(PropertiesStruct:duration, '1.')
      then timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:duration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000 + 24*3600
      else timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:duration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000
      end) as percentiles50,
      percentile_cont(0.90) within group (order by case when startswith(PropertiesStruct:duration, '1.')
      then timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:duration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000 + 24*3600
      else timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:duration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000
      end) as percentiles90,
      percentile_cont(0.95) within group (order by case when startswith(PropertiesStruct:duration, '1.')
      then timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:duration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000 + 24*3600
      else timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:duration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000
      end) as percentiles95
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(hour, 6, timestamp '2014-03-08 12:00:00')
      and startswith(Message, 'IngestionCompletionEvent')
      and Source in ('IMAGINEFIRST0', 'CLIMBSTEADY83', 'INTERNALFIRST79', 'WORKWITHIN77', 'ADOPTIONCUSTOMERS81', 'FIVENEARLY85', 'WHATABOUT98', 'PUBLICBRAINCHILD89', 'WATCHPREVIEW91', 'LATERYEARS87', 'GUTHRIESSCOTT93', 'THISSTORING16')
      and regexp_like(Properties,'.*\\bparquet\\b.*', 'is')
    group by
      Source;

Query 14 – Filter and aggregate top values from semi-structured data

This query performs filtering and aggregation, and finds top values based on a dynamic column of semi-structured data.

ADX
    Logs
    | where Timestamp between(datetime(2014-03-08 12:00) .. 1h)
    | where Component == "DOWNLOADER"
    | summarize DownloadRate=max(todouble(Properties.compressedSize) / ((totimespan(Properties.downloadDuration) / 1sec))) by Source
    | top 10 by DownloadRate
BigQuery
    select
      Source,
      max(PropertiesStruct.compressedSize / (time_diff(PropertiesStruct.downloadDuration, '00:00:00', microsecond) / 1000000)) as DownloadRate
    from
      gigaom-microsoftadx-2022.logs100tb.logs_c
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 1 hour)
      and Component = 'DOWNLOADER'
    group by
      Source
    order by
      DownloadRate desc
    limit 10;
Snowflake
    select
      Source,
      max(
        cast(PropertiesStruct:compressedSize as int) /
        ( case when startswith(PropertiesStruct:downloadDuration, '1.')
      then timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:downloadDuration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000 + 24*3600
      else timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:downloadDuration, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000
      end
      )) as DownloadRate
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(hour, 1, timestamp '2014-03-08 12:00:00')
      and Component = 'DOWNLOADER'
    group by
      Source
    order by
      DownloadRate desc
    limit 10;

Query 15 – Find top activities and get analysis of node performance

This query finds top activities (by CPU consumption) and then analyzes node performance over buckets of 5 minutes.

ADX
    let Data = Logs
        | where Source == 'IMAGINEFIRST0'
        | where Timestamp between(datetime(2014-03-08 12:00) .. 5d)
        | where Message startswith 'IngestionCompletionEvent';
    let TopNodesByCPU = Data
    | summarize MaxCPU = max(totimespan(Properties.cpuTime)) by Node
    | order by MaxCPU desc, Node desc
    | take 10
    | project Node;
    Data
    | where Node in (TopNodesByCPU)
    | summarize AverageProcessTimeInSeconds=avg(totimespan(Properties.cpuTime)) by bin(Timestamp, 5m), Node
BigQuery
    with TopNodesByCPU as (
      select
        Node,
        time_diff(PropertiesStruct.cpuTime, '00:00:00', microsecond) / 1000000 as cpuTime
      from
        gigaom-microsoftadx-2022.logs100tb.logs_c
      where
        Source = 'IMAGINEFIRST0'
        and Timestamp between
          timestamp '2014-03-08 12:00:00'
          and timestamp_add(timestamp '2014-03-08 12:00:00', interval 5 day)
        and starts_with(lower(Message), 'ingestioncompletionevent')
      group by
        Node,
        cpuTime
      order by
        cpuTime desc,
        Node desc
      limit 10
      )
    select
      Node,
      avg(time_diff(PropertiesStruct.cpuTime, '00:00:00', microsecond) / 1000000),
      timestamp_seconds(5*60 * div(unix_seconds(Timestamp), 5*60)) as bin
    from
      gigaom-microsoftadx-2022.logs100tb.logs_c
    where
      Node in (select Node from TopNodesByCPU)
      and Source = 'IMAGINEFIRST0'
      and Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 5 day)
      and starts_with(lower(Message), 'ingestioncompletionevent')
    group by
      Node,
      bin
    order by
      bin
    ;
Snowflake
    with TopNodesByCPU as (
      select
        Node,
        case when startswith(PropertiesStruct:cpuTime, '1.')
          then timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:cpuTime, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000 + 24*3600
          else timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:cpuTime, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000
        end as cpuTime
      from
        logs_c
      where
        Source = 'IMAGINEFIRST0'
        and Timestamp between
          to_timestamp('2014-03-08 12:00:00')
          and timestampadd(day, 5, timestamp '2014-03-08 12:00:00')
        and startswith(lower(Message), 'ingestioncompletionevent')
      group by
        Node,
        cpuTime
      order by
        cpuTime desc,
        Node desc
      limit 10
      )
    select
      Node,
      avg(case when startswith(PropertiesStruct:cpuTime, '1.')
        then timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:cpuTime, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000 + 24*3600
        else timediff(microsecond, to_time('00:00:00'), to_time(regexp_replace(PropertiesStruct:cpuTime, '^1\\.(.*)', '\\1', 1, 1, 'm'))) / 1000000
      end),
      time_slice(Timestamp, 5, 'MINUTE', 'START') as bin
    from
      logs_c
    where
      Node in (select Node from TopNodesByCPU)
      and Source = 'IMAGINEFIRST0'
      and Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(day, 5, timestamp '2014-03-08 12:00:00')
      and startswith(lower(Message), 'ingestioncompletionevent')
    group by
      Node,
      bin
    order by
      bin
    ;

Query 16 – Top nested drill down into 3 levels of errors

This query performs a top-nested drill-down into errors with three levels: Source (all), Nodes (top 3), and Components (top 3 + others).

ADX
    Logs
    | where Timestamp between (datetime(2014-03-08) .. 3h)
    | where Level in ("Error")
    | summarize (FirstErrorTime, FirstErrorComponent)=arg_min(Timestamp, Component),
                (LastErrorTime, LastErrorComponent)=arg_max(Timestamp, Component),
                Errors=count() by Source
    | top 5 by Errors
BigQuery
    with ArgMax as (
      select
        Source,
        min(Timestamp) as MinTS,
        max(Timestamp) as MaxTS,
        count(*) as Errors
      from
        gigaom-microsoftadx-2022.logs100tb.logs
      where
        Timestamp between
            timestamp '2014-03-08'
            and timestamp_add(timestamp '2014-03-08', interval 3 hour)
        and Level = 'Error'
      group by
        Source
    )
    select
      ArgMax.Source,
      MinTS as FirstErrorTime,
      FirstComp.Component as FirstErrorComponent,
      MaxTS as LastErrorTime,
      LastComp.Component as LastErrorComponent,
      Errors
    from
      ArgMax
    left join
      gigaom-microsoftadx-2022.logs100tb.logs FirstComp
        on FirstComp.Source = ArgMax.Source and FirstComp.Timestamp = ArgMax.MinTS
    left join
      gigaom-microsoftadx-2022.logs100tb.logs LastComp
        on LastComp.Source = ArgMax.Source and LastComp.Timestamp = ArgMax.MaxTS
    order by
      Errors desc
    limit 5
    ;
Snowflake
    with ArgMax as (
      select
        Source,
        min(Timestamp) as MinTS,
        max(Timestamp) as MaxTS,
        count(*) as Errors
      from
        logs_c
      where
        Timestamp between
            to_timestamp('2014-03-08 00:00:00')
            and timestampadd(hour, 3, timestamp '2014-03-08 00:00:00')
        and Level = 'Error'
      group by
        Source
    )
    select
      ArgMax.Source,
      MinTS as FirstErrorTime,
      FirstComp.Component as FirstErrorComponent,
      MaxTS as LastErrorTime,
      LastComp.Component as LastErrorComponent,
      Errors
    from
      ArgMax
    left join
      logs_c FirstComp
        on FirstComp.Source = ArgMax.Source and FirstComp.Timestamp = ArgMax.MinTS
    left join
      logs_c LastComp
        on LastComp.Source = ArgMax.Source and LastComp.Timestamp = ArgMax.MaxTS
    order by
      Errors desc
    limit 5
    ;
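
The ArgMax CTE in both SQL translations emulates ADX's arg_min()/arg_max() by joining back to the table on each Source's boundary timestamps. Note that if several error rows share a Source's first or last Timestamp, those left joins can fan out into extra rows, whereas ADX returns exactly one row per group. A window-function formulation avoids the self-joins entirely; the following is a minimal sketch in Snowflake syntax (not one of the benchmark queries):

    select
      Source,
      min(Timestamp) over (partition by Source) as FirstErrorTime,
      first_value(Component) over (partition by Source order by Timestamp) as FirstErrorComponent,
      max(Timestamp) over (partition by Source) as LastErrorTime,
      last_value(Component) over (partition by Source order by Timestamp
        rows between unbounded preceding and unbounded following) as LastErrorComponent,
      count(*) over (partition by Source) as Errors
    from
      logs_c
    where
      Timestamp between
          to_timestamp('2014-03-08 00:00:00')
          and timestampadd(hour, 3, timestamp '2014-03-08 00:00:00')
      and Level = 'Error'
    -- collapse to one row per Source, then rank sources by error count
    qualify row_number() over (partition by Source order by Timestamp) = 1
    order by Errors desc
    limit 5
    ;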

Query 17 – Find top count of ingested rows for a specific source

This query finds the 10 tables with the highest count of ingested rows for a specific Source.

ADX
    let Data =
    Logs
    | where Timestamp between(datetime(2014-03-08 12:00) .. 3d)
    | where Source == 'IMAGINEFIRST0';
    Data
    | where Message startswith '$$IngestionCommand'
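    // parse captures the text between 'table=' and the next space into Table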
    | parse Message with '$$IngestionCommand table=' Table ' ' *
    | distinct hint.shufflekey=ClientRequestId ClientRequestId, Table
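    // the shufflekey hints repartition the distinct and the join by ClientRequestId across cluster nodes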
    | join kind=inner hint.shufflekey=ClientRequestId (
        Data
        | where Message startswith 'IngestionCompletionEvent'
    ) on ClientRequestId
    | summarize TotalRows = sum(tolong(Properties.rowCount)) by Table
    | top 10 by TotalRows
BigQuery
    with Data as (
    select
      *
    from
      gigaom-microsoftadx-2022.logs100tb.logs_c
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 3 day)
      and Source = 'IMAGINEFIRST0'
    ),
    Data2 as (
    select
      distinct regexp_extract(Message, 'table=(\\w+)') as icTable,
      ClientRequestId
    from
      Data
    where
      starts_with(lower(Message), '$$ingestioncommand')
    )
    select
      icTable,
      sum(l.PropertiesStruct.rowCount) as TotalRows
    from
      Data l
    inner join
      Data2 d
      on l.ClientRequestId = d.ClientRequestId
    where
      starts_with(l.Message, 'IngestionCompletionEvent')
    group by
      icTable
    order by
      TotalRows desc
    limit 10
    ;
Snowflake
    with Data as (
    select
      *
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(day, 3, timestamp '2014-03-08 12:00:00')
      and Source = 'IMAGINEFIRST0'
    ),
    Data2 as (
    select
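      -- the 'e' parameter returns capture group 1 (the table name) rather than the full match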
      distinct regexp_substr(Message, 'table=(\\w+)', 1, 1, 'e', 1) as icTable,
      ClientRequestId
    from
      Data
    where
      startswith(lower(Message), '$$ingestioncommand')
    )
    select
      icTable,
      sum(cast(l.PropertiesStruct:rowCount as int)) as TotalRows
    from
      Data l
    inner join
      Data2 d
      on l.ClientRequestId = d.ClientRequestId
    where
      startswith(l.Message, 'IngestionCompletionEvent')
    group by
      icTable
    order by
      TotalRows desc
    limit 10
    ;

Query 18 – Aggregate a dynamic property and drill-down

This query performs an aggregation based on a dynamic property (the ingestion format) and then drills down to the 10 tables with the highest row counts within each format.

ADX
    let Data =
    Logs
    | where Timestamp between(datetime(2014-03-08 12:00) .. 3d)
    | where Source == 'IMAGINEFIRST0';
    Data
    | where Message startswith '$$IngestionCommand'
    | parse Message with '$$IngestionCommand table=' Table ' ' *
    | distinct hint.shufflekey=ClientRequestId ClientRequestId, Table
    | join kind=inner hint.shufflekey=ClientRequestId (
        Data
        | where Message startswith 'IngestionCompletionEvent'
    ) on ClientRequestId
    | extend Format=tostring(Properties.format), RowCount=tolong(Properties.rowCount)
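    // level 1 keeps every Format (max(1) is a dummy aggregate);
    // level 2 keeps the 10 tables with the largest summed RowCount per Format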
    | top-nested of Format by max(1),
      top-nested 10 of Table by Rows=sum(RowCount)
    | project Format, Table, Rows
BigQuery
    with Data as (
    select
      *
    from
      gigaom-microsoftadx-2022.logs100tb.logs_c
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 3 day)
      and Source = 'IMAGINEFIRST0'
    ),
    Data2 as (
    select
      distinct regexp_extract(Message, 'table=(\\w+)') as icTable,
      ClientRequestId
    from
      Data
    where
      starts_with(lower(Message), '$$ingestioncommand')
    )
    select
      Format,
      icTable,
      RowCount
    from (
    select
      *,
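      -- rank tables within each Format by summed row count (emulates ADX top-nested)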
      row_number() over (partition by Format order by RowCount desc) as rownum
    from (
    select
      l.PropertiesStruct.format as Format,
      icTable,
      sum(l.PropertiesStruct.rowCount) as RowCount
    from
      Data l
    inner join
      Data2 d
      on l.ClientRequestId = d.ClientRequestId
    where
      starts_with(l.Message, 'IngestionCompletionEvent')
    group by
      Format,
      icTable
    )
    )
    where
      rownum <= 10
    ;
Snowflake
    with Data as (
    select
      *
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(day, 3, timestamp '2014-03-08 12:00:00')
      and Source = 'IMAGINEFIRST0'
    ),
    Data2 as (
    select
      distinct regexp_substr(Message, 'table=(\\w+)', 1, 1, 'e', 1) as icTable,
      ClientRequestId
    from
      Data
    where
      startswith(lower(Message), '$$ingestioncommand')
    )
    select
      Format,
      icTable,
      RowCount
    from (
    select
      *,
      row_number() over (partition by Format order by RowCount desc) as rownum
    from (
    select
      l.PropertiesStruct:format as Format,
      icTable,
      sum(cast(l.PropertiesStruct:rowCount as int)) as RowCount
    from
      Data l
    inner join
      Data2 d
      on l.ClientRequestId = d.ClientRequestId
    where
      startswith(l.Message, 'IngestionCompletionEvent')
    group by
      Format,
      icTable
    )
    )
    where
      rownum <= 10
    ;

Query 19 – Join on a calculated key and look for a bug

This query performs a join on a calculated key and checks whether any file was downloaded and completed by two different activities (i.e., it looks for a bug).

ADX
    let Data =
        Logs
        | where Timestamp between(datetime(2014-03-08 12:00) .. 3d)
        | where Source in ('IMAGINEFIRST0')
        ;
    let Downloading =
        Data
        | where Message startswith 'Downloading file path:'
        | parse Message with * 'path:' path
        | project DownloadClientRequestId = ClientRequestId, path;
    let IngestionCompletion =
        Data
        | where Message startswith 'IngestionCompletionEvent'
        | parse Message with * 'path:' path
        | project CompleteClientRequestId = ClientRequestId, path;
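    // any path downloaded and completed under two different ClientRequestIds indicates a bug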
    Downloading
    | join hint.shufflekey=path kind=inner (IngestionCompletion) on path
    | where DownloadClientRequestId != CompleteClientRequestId
    | count
BigQuery
    with Data as (
    select
      *
    from
      gigaom-microsoftadx-2022.logs100tb.logs
    where
      Timestamp between
        timestamp '2014-03-08 12:00:00'
        and timestamp_add(timestamp '2014-03-08 12:00:00', interval 3 day)
      and Source in ('IMAGINEFIRST0')
    ),
    Downloading as (
    select
      regexp_extract(Message, 'path:(.*)') as Path,
      ClientRequestId as DownloadClientRequestId
    from
      Data
    where
      starts_with(Message, 'Downloading file path:')
    ),
    IngestionCompletion as (
    select
      regexp_extract(Message, 'path:(.*)') as Path,
      ClientRequestId as CompleteClientRequestId
    from
      Data
    where
      starts_with(Message, 'IngestionCompletionEvent')
    )
    select
      count(*)
    from
      Downloading d
    inner join
      IngestionCompletion ic
      on d.Path = ic.Path
    where
      DownloadClientRequestId <> CompleteClientRequestId
    ;
Snowflake
    with Data as (
    select
      *
    from
      logs_c
    where
      Timestamp between
        to_timestamp('2014-03-08 12:00:00')
        and timestampadd(day, 3, timestamp '2014-03-08 12:00:00')
      and Source in ('IMAGINEFIRST0')
    ),
    Downloading as (
    select
      regexp_substr(Message, 'path:(.*)', 1, 1, 'e', 1) as Path,
      ClientRequestId as DownloadClientRequestId
    from
      Data
    where
      startswith(Message, 'Downloading file path:')
    ),
    IngestionCompletion as (
    select
      regexp_substr(Message, 'path:(.*)', 1, 1, 'e', 1) as Path,
      ClientRequestId as CompleteClientRequestId
    from
      Data
    where
      startswith(Message, 'IngestionCompletionEvent')
    )
    select
      count(*)
    from
      Downloading d
    inner join
      IngestionCompletion ic
      on d.Path = ic.Path
    where
      DownloadClientRequestId <> CompleteClientRequestId
    ;

8. About Microsoft

Microsoft (Nasdaq “MSFT” @microsoft) enables digital transformation for the era of an intelligent cloud and an intelligent edge. Its mission is to empower every person and every organization on the planet to achieve more.

Microsoft offers Azure Data Explorer (ADX). To learn more about ADX, visit https://azure.microsoft.com/en-us/services/data-explorer/.

9. About William McKnight

William McKnight is a former Fortune 50 technology executive and database engineer. An Ernst & Young Entrepreneur of the Year finalist and frequent best practices judge, he helps enterprise clients with action plans, architectures, strategies, and technology tools to manage information.

Currently, William is an analyst for GigaOm Research who takes corporate information and turns it into a bottom-line-enhancing asset. He has worked with Dong Energy, France Telecom, Pfizer, Samba Bank, ScotiaBank, Teva Pharmaceuticals, and Verizon, among many others. William focuses on delivering business value and solving business problems utilizing proven approaches in information management.

10. About Jake Dolezal

Jake Dolezal is a contributing analyst at GigaOm. He has two decades of experience in the information management field, with expertise in analytics, data warehousing, master data management, data governance, business intelligence, statistics, data modeling and integration, and visualization. Jake has solved technical problems across a broad range of industries, including healthcare, education, government, manufacturing, engineering, hospitality, and restaurants. He has a doctorate in information management from Syracuse University.

11. About GigaOm

GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.

GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.

GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.

12. Copyright

© Knowingly, Inc. 2022 "Log and Telemetry Analytics Performance Benchmark" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.