Top data analytics tools comparison: Alibaba Cloud, AWS, Azure, Google Cloud, IBM

Cloud computing is booming and has become the foundation for digital businesses.

Today it is difficult to find an organization that does not use cloud services. Be it web servers, development tools, operating systems, data storage, or individual application capabilities, the cloud offers it all.

A recent cloud adoption forecast from Gartner, Inc. tells the same story: the worldwide public cloud services market is expected to reach nearly $331.2 billion in 2022, growing at three times the rate of overall IT services.

In short, the cloud looks set to keep growing for the foreseeable future.

With the increasing use of the cloud, it is also important for organizations to understand how their systems in the cloud operate. Enterprises are therefore looking for ways to set up and manage their incoming data in real time to gain better insights and make better business decisions.

This is exactly where cloud-based data analytics solutions come in. They use advanced analysis techniques to turn data into clear visualizations that can be synchronized and shared among key employees.

Data analytics: turning enterprise data to value

Several leading vendors drive the data analytics market. In this post, we cover five of the major players: Alibaba Cloud, Amazon Web Services (AWS), Google Cloud, IBM, and Microsoft Azure. Each has its own strengths, and the best choice depends on the end goals of your organization. We compare them to help you pick the right one for your particular situation.

  • Alibaba Data Lake Analytics

Alibaba Data Lake Analytics (DLA) is a serverless, high-performance, interactive query service that quickly collects, stores, and processes flowing data and turns it into actionable insights.

With it, customers can run complex analytics on data in different formats from multiple sources to develop new insights. Users can also reliably process millions of events to support real-time decisions in areas such as social analytics and fraud detection.

Supported language elements: It supports standard SQL and works with BI tools for analyzing the data.

Integration and input sources: It can connect to multiple data sources once the relevant configuration settings are in place.

Price: Alibaba's analytics service is billed on actual usage.

See diagram of Alibaba’s data lake analytics in the image below:

Image: Alibaba

Read reviews of Data Lake Analytics: Gartner

  • AWS Kinesis Data Analytics

Amazon’s Kinesis Data Analytics is a massively scalable and durable real-time service for data ingestion, analysis, and delivery. It can continuously collect gigabytes of data per second from multiple sources.

Users can capture large volumes of data within milliseconds to address streaming use cases, such as anomaly detection and dynamic pricing, as quickly as possible.

Typically, Kinesis streams load aggregated data, such as application logs, IoT telemetry, website clickstreams, and social media streams, into data warehouses or data lakes (AWS data stores), while providing durability and elasticity.

Supported language elements: It uses standard SQL with some extensions for operating on streaming data.

Integration and input sources: It supports inputs from Kinesis Data Streams and Kinesis Data Firehose delivery streams. The results can then be analyzed using BI (Business Intelligence) tools.

Price: The price of Kinesis Data Analytics depends on the volume of data you ingest, store, and consume through the service.

The image below shows the high-level architecture of Kinesis:

Image: Amazon

Read reviews of Kinesis Data Analytics: Gartner.

  • Google Cloud Dataflow

Google’s Cloud Dataflow is a serverless, highly efficient, fully managed service that lets you process and analyze huge amounts of data in real time. This helps you derive insights and compute meaningful analytics over your streaming data.

With this model, users can efficiently perform analytics, implement multi-step processing pipelines, monitor their execution, and get advanced alerting to identify and respond quickly to complex issues.
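To make the idea of a multi-step processing pipeline concrete, here is a minimal sketch in plain Python. It is not the Dataflow or Apache Beam API; the event fields (`level`, `service`) and all function names are hypothetical and chosen only for illustration. Each step consumes the output of the previous one, which is the general shape of the pipelines described above.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def parse(lines: Iterable[str]) -> Iterator[dict]:
    """Step 1: decode each raw JSON line, skipping malformed records."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop lines that are not valid JSON
        if isinstance(record, dict):
            yield record


def keep_errors(events: Iterable[dict]) -> Iterator[dict]:
    """Step 2: keep only events whose (hypothetical) 'level' is 'error'."""
    return (e for e in events if e.get("level") == "error")


def count_by_service(events: Iterable[dict]) -> Counter:
    """Step 3: aggregate the remaining events per (hypothetical) 'service'."""
    return Counter(e.get("service", "unknown") for e in events)


def run_pipeline(lines: Iterable[str]) -> Counter:
    """Chain the three steps; records flow through lazily, one at a time."""
    return count_by_service(keep_errors(parse(lines)))
```

A managed service adds what this sketch leaves out: distributing the steps across workers, monitoring their execution, and alerting when something goes wrong.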

Supported language elements: It supports Java, Python, and Scala, with other languages to follow. It also supports SQL queries through Google BigQuery.

Integration and input sources: Dataflow supports streaming input from Google Cloud services such as Cloud Storage and Pub/Sub. As a managed service for Apache Beam pipelines, it can also connect to Apache Kafka and Google BigQuery.

Price: The price of Google Dataflow varies with the services you choose.

See the data transformation of Google Dataflow in the image below:

Image: Google

Read reviews of Google Dataflow: Gartner.

  • IBM Streaming Analytics

IBM’s Streaming Analytics enables users to extract value from data in motion, reduce infrastructure costs, and get faster insights and alerts. It can serve companies of all sizes by handling millions of high-rate events and messages per second.

Users can combine information from different applications, such as transaction processing systems, to spot threats and opportunities and make real-time decisions.

Supported language elements: It supports applications written in SPL (Streams Processing Language), Java, Scala, and Python, as well as Apache Beam.

Integration and input sources: The solution ingests data from a variety of sources, including IBM Event Streams, HTTP, and Internet of Things (IoT), and connects with data streaming sources and systems configured to perform analytics.

Price: IBM Streaming Analytics is offered as part of IBM's cloud services. See IBM's pricing page for details.

See the infrastructure in the image below:

Image: IBM

Read reviews of Streaming Analytics: Gartner.

  • Microsoft Azure Stream Analytics

Microsoft’s Azure Stream Analytics is a popular, fully managed real-time data analytics service for complex event processing. It enables users to unlock actionable insights from a wide range of data.

Users can examine huge volumes of data that would be missed in manual analysis. They can also detect anomalies such as spikes or dips, and predict positive or negative trends through online learning and scoring models. The results can be stored for later investigation, or the detected patterns can be used for quick action.
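As a rough illustration of what spike and dip detection on a stream involves, here is a minimal sketch in plain Python. It is not Azure Stream Analytics' own detection model; it simply flags a reading that lies more than a chosen number of standard deviations from the average of the most recent readings. The function name, window size, and threshold are assumptions made for this example.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Iterable, Iterator, Tuple


def detect_spikes_and_dips(
    values: Iterable[float], window: int = 5, threshold: float = 3.0
) -> Iterator[Tuple[int, float, str]]:
    """Yield (index, value, kind) for readings far from the recent average."""
    history = deque(maxlen=window)  # the most recent `window` readings
    for i, v in enumerate(values):
        if len(history) == window:  # only judge once the window is full
            mu = mean(history)
            sigma = pstdev(history)
            # Skip the check when recent readings are perfectly flat.
            if sigma > 0 and abs(v - mu) > threshold * sigma:
                yield i, v, "spike" if v > mu else "dip"
        history.append(v)  # anomalies also enter the history


readings = [10, 11, 10, 12, 11, 50, 11, 10, 12, 10, 11, 1, 10]
anomalies = list(detect_spikes_and_dips(readings))
```

Note that an anomaly stays in the history until it slides out of the window, so a second anomaly arriving right after the first can be masked. Production services use more robust models than this.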

Streaming platform: Azure Event Hubs.

Supported language elements: It uses a simple SQL-based query language that can be extended with JavaScript user-defined functions (UDFs) or user-defined aggregates, which lets users perform complex business calculations.

Integration and input sources: It can connect with multiple IoT devices and supports inputs from Azure Event Hubs, Azure IoT Hub, and Azure Blob Storage.

Outputs: Results can be sent to any of the following: Event Hubs, Azure Functions, Service Bus, SQL Server, Cosmos DB, Blob or Table storage, Data Lake, or a streaming Power BI dashboard.

Price: The price of Azure Stream Analytics is based on the number of streaming units a user needs to process the data.

See the roles of the services within the architecture in the image below:

Image: Microsoft

Read reviews of Azure Stream Analytics: Gartner.

| Feature | Azure Stream Analytics | AWS Kinesis Data Analytics | Google Cloud Dataflow | IBM Streaming Analytics | Alibaba Data Lake Analytics |
|---|---|---|---|---|---|
| Programmability | Stream analytics query language, JavaScript | Data analytics query language, standard SQL | Java, Python and a distributed compute platform | Java, Scala and Python | Standard SQL |
| Programming model | Declarative | Flink programming model, Declarative | Apache Beam | Streams Processing Language (SPL), Declarative | Declarative |
| Pricing model | Streaming units | Hourly rate based on the average streaming units | Based on Google Compute Engine (GCE) costs plus an additional charge per vCPU per minute | Subscription based | Based on the number of bytes scanned |
| Inputs | Azure Event Hubs, Azure IoT Hub, Azure Blob storage | Data sources through SQL JOINS: streaming data sources like Kinesis Data Streams and reference data sources like Amazon S3 | Cloud Storage and PubSub | File, Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) | Alibaba Cloud Object Storage Service (OSS), PostgreSQL, MySQL, NoSQL (Table Store) and ApsaraDB, using DLA and Quick BI |
| Sinks | Azure Data Lake Store, Azure SQL Database, Storage Blobs, Event Hubs, Power BI, Table Storage, Service Bus Queues, Service Bus Topics, Cosmos DB, Azure Functions | Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, Amazon DynamoDB, and Amazon S3 (through file sink integrations) | Cloud Storage, BigQuery, BigTable, PubSub, Datastore, etc. | TCP network connection, UDP network connection and User-defined Sink Operator | NA |
| Built-in temporal/windowing support | Yes | Yes | NA | Yes | NA |
| Input data formats | Avro, JSON or CSV, UTF-8 encoded | JSON, CSV, and TSV | AVRO, CSV, JSON | JSON | JSON, Vector and other multi-media resources |
| Scalability | Query partitions | Shards | Shards | Horizontal partitions | Horizontal partitions |
| Late arrival and out of order event handling support | Yes | Yes | Yes | NA | Yes |
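Two rows in the comparison, windowing support and late arrival and out of order event handling, are easier to weigh with a small example. The sketch below is plain Python and does not reproduce any vendor's implementation. It groups events into fixed-size (tumbling) windows by event time and uses a simple watermark to decide when an event arrives too late to count. The function name and the `window_size` and `allowed_lateness` parameters are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[int, str]  # (event_time in seconds, payload)


def tumbling_window_counts(
    events: Iterable[Event], window_size: int = 10, allowed_lateness: int = 5
) -> Tuple[Dict[int, int], List[Event]]:
    """Count events per tumbling window, keyed by window start time.

    Events are given in arrival order, which may differ from event-time
    order. An event is dropped when its window ended at or before the
    watermark (latest event time seen minus allowed_lateness).
    """
    counts: Dict[int, int] = defaultdict(int)
    dropped: List[Event] = []
    watermark = float("-inf")
    for event_time, payload in events:
        watermark = max(watermark, event_time - allowed_lateness)
        window_start = (event_time // window_size) * window_size
        window_end = window_start + window_size
        if window_end <= watermark:
            dropped.append((event_time, payload))  # too late, window closed
            continue
        counts[window_start] += 1
    return dict(counts), dropped


arrivals = [(1, "a"), (4, "b"), (12, "c"), (9, "d"), (23, "e"), (8, "f")]
counts, dropped = tumbling_window_counts(arrivals)
# Event (9, "d") arrives out of order but is still counted in window 0;
# event (8, "f") arrives after window 0 has closed and is dropped.
```

Services that list "Yes" for these rows handle this bookkeeping for you, usually with configurable tolerance for lateness.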

With so much competition in the cloud data analytics space, it is getting difficult for organizations to choose among options that provide very similar services. We’ve tried to make it easier for you. Go through the comparison and tell us in the comments section which one you like the most.

Disclaimer: The information contained in this article is for general information purpose only. Price and product information are subject to change. This information has been sourced from the websites and relevant resources available in the public domain of the named vendors as on 26th November, 2019. Wire19 News makes best endeavors to ensure that the information is accurate and up to date, however, it does not warrant or guarantee that anything written here is 100% accurate, timely, or relevant to the website visitors.
