Big data is a collection of large and complex data sets that cannot be processed using conventional data processing systems. Analytics refers to extracting meaningful patterns from unstructured or semi-structured data. Big data analytics, then, is the processing of large volumes of data (unstructured, semi-structured and structured) to discover unknown correlations, hidden data patterns, consumer preferences, market trends and other useful business information.
The results of the analysis can be used to improve operational efficiency, target potential customers more precisely, open new avenues for revenue generation, gain a competitive edge over rival firms and provide better customer service. The main aim of big data analytics is to help firms make more informed business decisions by enabling analytics professionals and data scientists to analyze and interpret huge volumes of data that remain untapped by traditional business intelligence programs.
Big data encompasses internet clickstream information, server logs, social media content, social network reports, mobile phone call records, customer survey responses, market research data and more. Usually only unstructured and semi-structured data are associated with big data, but consulting companies like Forrester and Gartner consider structured data a valid component of big data as well.
Technologies in Big Data Analytics
Some of the software tools used to analyze big data are based on predictive analytics. There is also a newer class of technologies, such as Hadoop and related tools like MapReduce, YARN, Spark, Pig and NoSQL databases, that can process big data. These technologies form a crucial part of the open source software ecosystem that supports the processing of large and diverse data sets.
Hadoop
Hadoop is an open source, Java-based programming framework that supports the processing of huge data sets in a distributed computing environment. It is a project of the Apache Software Foundation. It was inspired by Google's MapReduce, a software framework in which an application is divided into numerous small parts. These parts (also called fragments or blocks) can be run on any node (a connection point in a network) in the cluster.
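To make the MapReduce idea concrete, below is a minimal word-count sketch in the style of Hadoop Streaming, written in Python. The file names (mapper.py, reducer.py) and the word-count task are illustrative assumptions rather than anything specific to a real deployment: Hadoop splits the input file into blocks, runs a copy of the mapper on each block on whatever node holds it, and then feeds the sorted intermediate pairs to the reducer.

# mapper.py - Hadoop runs one copy of this per input split (block), in parallel across nodes
import sys

for line in sys.stdin:
    for word in line.strip().split():
        # emit an intermediate (key, value) pair for every word seen
        print(f"{word}\t1")

# reducer.py - receives the mapper output grouped and sorted by key
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t")
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(f"{current_word}\t{current_count}")   # total for the previous word
        current_word, current_count = word, int(count)
if current_word is not None:
    print(f"{current_word}\t{current_count}")

Such a job is typically launched with Hadoop's streaming jar, passing the two scripts as the mapper and reducer; the splitting of the input into fragments and the scheduling of those fragments onto cluster nodes is handled entirely by the framework.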
Illustration of Big Data Analytics
Big Data Analytics solution by LogicMatter
Let us now illustrate how a big data analytics solution can be implemented using Hadoop. LogicMatter is a low-cost big data analytics solution provider. In its solution, traditional (e.g. ODS, EDW) and emerging (Hadoop MapReduce) analytical tools are combined to operate on big data. The data platform is built on the powerful and flexible Amazon Web Services (AWS) cloud platform. To capture, process, store and transform data, Hadoop is used together with the LogicMatter-designed Analytical Data Store (ADS). The platform uses Hadoop's file-based storage for flexible and fast data processing. This big data analytics platform enables the continuous delivery of both real-time and historical analytics via the popular Tableau visualization tool.
The analytics platform and solutions are built specifically to solve complex customer problems such as clickstream analytics, video analytics, sales performance analysis, fraud detection and financial analytics.
Data Sources
This platform makes it possible to collect, process, store and transform both unstructured and structured data exclusively for analytical purposes. It can quickly process many varieties of unstructured data, including text, documents, weblogs, XML files, Excel files, audio and video, call logs, clickstream data and event data. It can also simultaneously process structured data from familiar enterprise data sources such as CRM, ERP and SQL databases.
The data collection process is separated from transformation and analysis. This makes it easy to add data sources of both known and unknown kinds without impacting the analysis, which is a major challenge with existing analytics solutions. Transformation of the data is delayed until the analysis is actually needed, avoiding wasted effort and reducing upfront costs.
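As a rough sketch of what separating collection from transformation ("schema-on-read") can look like in code, consider the following Python fragment. The directory layout, field names and the click-counting question are assumptions made purely for illustration, not LogicMatter's actual design: raw records are landed untouched, and a schema is imposed only at the moment a specific analysis is run.

import json
import pathlib
import time

RAW_DIR = pathlib.Path("raw_events")   # stand-in for an HDFS-style landing area
RAW_DIR.mkdir(exist_ok=True)

def collect(raw_record):
    """Append the record exactly as received; new sources need no schema changes."""
    day_file = RAW_DIR / (time.strftime("%Y-%m-%d") + ".log")
    with day_file.open("a") as f:
        f.write(raw_record.rstrip("\n") + "\n")

def clicks_per_page(day_file):
    """Interpret the raw records only now, at analysis time (schema-on-read)."""
    counts = {}
    for line in day_file.open():
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue   # records in unknown formats are simply skipped by this analysis
        if event.get("type") == "click":
            page = event.get("page", "unknown")
            counts[page] = counts.get(page, 0) + 1
    return counts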
Data Platform
The AWS data platform consists of two primary components:
-Hadoop cluster.
-LogicMatter-designed ADS (Analytical Data Store).
The flexible and scalable Hadoop technology is used to collect both structured and unstructured data. The collected data is integrated, pre-processed and stored in the ADS. Hadoop's flat, file-based storage system allows you to scale quickly and to handle large amounts of known and unknown data. Hadoop serves as an integrated, intermediate data store and acts as a feeder to the ADS. The data from Hadoop is mapped and transformed to develop a data model. The model, built iteratively and stored in the ADS, forms the basis for powerful analytics. The ADS uses traditional data warehouse technology, such as cubes and OLAP. Hence, it supports all the powerful, traditional analytical techniques that you are used to (dashboards, reports, scorecards, etc.).
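The following is a minimal sketch of the model-building step described above, using pandas purely as an illustration; the column names and figures are invented. Records coming out of Hadoop are mapped into a small fact table and pre-aggregated along the dimensions a report would slice by, in the spirit of an OLAP cube stored in the ADS.

import pandas as pd

# pretend these rows are the cleaned, mapped output of a Hadoop job
fact = pd.DataFrame([
    {"date": "2014-01-01", "region": "NA", "product": "A", "revenue": 120.0},
    {"date": "2014-01-01", "region": "EU", "product": "A", "revenue": 80.0},
    {"date": "2014-01-02", "region": "NA", "product": "B", "revenue": 200.0},
])

# pre-aggregate along the dimensions a dashboard or scorecard would slice by,
# similar in spirit to a cube held in the ADS
cube = fact.pivot_table(index="date", columns="region",
                        values="revenue", aggfunc="sum", fill_value=0)
print(cube)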
Visualization
One of the unique design features of LogicMatter's big data analytics services is that it enables continuous analytics, both real-time and historical. Because there is an integrated data discovery platform, the visualization tool can connect directly to either Hadoop or the ADS to develop the analytics. Ad-hoc queries can be run against Hadoop for exploratory analytics and instant data access. For standard, canned reports and dashboards, you connect to the ADS to gain a historical perspective.
The data platform is flexible enough that you can easily connect any of your favorite visualization tools (such as QlikView or Excel).
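As a small illustration of the two access paths, the sketch below answers an exploratory question by scanning raw Hadoop output files on the fly, while a canned report simply reads a summary table that was already aggregated into the ADS. The file names and the error-rate question are assumptions for the sketch, not part of any specific LogicMatter interface.

import csv
import glob

def adhoc_error_rate(pattern="hadoop_output/part-*"):
    """Exploratory path: scan the raw Hadoop output files directly."""
    total = errors = 0
    for path in glob.glob(pattern):
        with open(path) as f:
            for line in f:
                total += 1
                if "ERROR" in line:
                    errors += 1
    return errors / total if total else 0.0

def monthly_report(summary_csv="ads_monthly_summary.csv"):
    """Canned-report path: read a summary that was already aggregated into the ADS."""
    with open(summary_csv) as f:
        return list(csv.DictReader(f))   # nothing to compute; just render it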
Testimonials Of Big Data Analytics
There are several examples of how bigger, better, faster,
stronger applications, analytics, sensors, and networks are creating results
with big data today across various industries.
1. The Financial Services Industry
The financial services industry uses big data to make better financial decisions. Banking giant Morgan Stanley ran into issues doing portfolio analysis on traditional databases and now uses Hadoop to analyze investments on a larger scale and with better results. Hadoop is also used in the industry for sentiment analysis, financial trading and predictive analytics.
2. The Automotive Industry
Ford’s modern hybrid Fusion model generates up to 25 GB of data per hour. The data obtained can be used to understand driving behavior, track wear and tear to identify issues and lower maintenance costs, avoid accidents, and even confirm travel arrangements.
3. Supply Chain And Logistics
Companies like Union Pacific Railroad use thermometers and ultrasound to capture data about their engines and send it for analysis to identify equipment at risk of failure. INTTRA, the world's largest multi-carrier network for the ocean shipping industry, uses its OceanMetrics application to allow shippers and carriers to measure their own performance. Companies are also using telematics and big data to streamline trucking fleets. GE believes these types of new capabilities can contribute $15 trillion to global GDP by 2030 through systematic and data-driven analysis.
4. Retail
Walmart is using big data from 10 different websites to feed shopper transaction data into analytical systems. Sears and Kmart are trying to improve the personalization of marketing campaigns and offers with big data to compete better with Walmart and Target.
Practical Big Data Benefits
Develop Target Markets
By analyzing customers' purchase orders, companies can gain a much better understanding of who is buying their products and can target those customers accordingly.
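A minimal sketch of the idea, with an invented list of purchase orders and an arbitrary spend threshold: group spend by customer for a product line and flag the heavy buyers as the segment to target.

from collections import defaultdict

orders = [
    {"customer": "c1", "category": "sports", "amount": 90.0},
    {"customer": "c2", "category": "sports", "amount": 15.0},
    {"customer": "c1", "category": "books", "amount": 30.0},
    {"customer": "c3", "category": "sports", "amount": 120.0},
]

spend = defaultdict(float)
for order in orders:
    if order["category"] == "sports":   # the product line being promoted
        spend[order["customer"]] += order["amount"]

# customers spending above the threshold become the target segment
target_segment = [c for c, total in spend.items() if total >= 50.0]
print(target_segment)   # ['c1', 'c3'] for this toy data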
Customize your website in real time
Through big data analytics, companies can personalize their websites and portals based on the gender, location and nationality of customers and offer them tailored recommendations. The best example of this is Amazon's use of item-based collaborative filtering (IBCF). Amazon uses features such as “Customers who bought this item also bought” and “Frequently bought together” to reach more customers, and has been able to generate more revenue through these methods.
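A minimal sketch of the item-based idea, using a toy set of shopping baskets and plain co-occurrence counts (Amazon's production system is, of course, far more sophisticated): for each item, recommend the items most often bought together with it.

from collections import defaultdict
from itertools import combinations

baskets = [
    {"phone", "case", "charger"},
    {"phone", "case"},
    {"phone", "charger"},
    {"book", "lamp"},
]

# count how often each pair of items appears in the same basket
co_counts = defaultdict(lambda: defaultdict(int))
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co_counts[a][b] += 1
        co_counts[b][a] += 1

def also_bought(item, k=2):
    """'Customers who bought this item also bought' for the toy data above."""
    ranked = sorted(co_counts[item].items(), key=lambda kv: kv[1], reverse=True)
    return [other for other, _ in ranked[:k]]

print(also_bought("phone"))   # ['case', 'charger'] on this toy data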
Create new revenue streams
The insights that a company obtains from analyzing markets and consumers with big data are not valuable only to that company. Firms can sell them as non-personalized trend data to large industry players operating in the same segment and create a whole new revenue stream. Companies like Bloomberg and Analytics Quotient already sell analyzed information to other companies and generate revenue from it.
Reducing maintenance costs
Factories traditionally estimate that a certain type of equipment is likely to wear out after a set number of years, so they replace every piece of that equipment on that schedule. Big data tools do away with such impractical and costly practices. The massive amounts of data these tools can access, combined with their speed, make it possible to spot failing devices and predict when equipment will actually wear out. This results in a much more cost-effective replacement strategy, as faulty devices are identified far sooner.
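A rough sketch of the underlying idea, using made-up sensor readings and a deliberately simple rule (real systems use far richer models): compare each device's most recent readings against the fleet baseline and flag the ones drifting away from it.

import statistics

# recent vibration readings per device (illustrative numbers only)
readings = {
    "engine-01": [0.9, 1.0, 1.1, 1.0],
    "engine-02": [1.0, 1.2, 1.6, 2.1],   # trending upward
    "engine-03": [1.1, 1.0, 0.9, 1.0],
}

all_values = [v for series in readings.values() for v in series]
fleet_mean = statistics.mean(all_values)
fleet_stdev = statistics.stdev(all_values)

at_risk = []
for device, series in readings.items():
    recent = statistics.mean(series[-2:])        # average of the latest two readings
    if recent > fleet_mean + fleet_stdev:        # drifting well above the fleet baseline
        at_risk.append(device)

print(at_risk)   # ['engine-02'] for this data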
Offering enterprise-wide insights
Previously, when business users needed to analyze large amounts of varied data, they had to ask their IT colleagues for help because they lacked the technical expertise themselves. By the time they received the requested information, it was often no longer useful or even correct. Now, with big data tools, the technical teams can do the groundwork and build algorithms for faster searches. They can develop systems and install interactive, dynamic visualization tools that allow business users to analyze, view and benefit from crucial data.
Making Smart Cities
To deal with the consequences of their rapid expansion, a growing number of smart cities are leveraging big data tools for the benefit of their citizens. Oslo in Norway, for instance, reduced street lighting energy consumption by 62% with a smart solution. The Memphis Police Department started using predictive software in 2006 and has been able to reduce serious crime by 30%. The city of Portland, Oregon, used technology to optimize the timing of its traffic signals and was able to cut substantial CO2 emissions in just six years.
References:
http://searchbusinessanalytics.techtarget.com/definition/big-data-analytics
http://dermatological/big-data-analytics-services-solutions/
http://blog.pivotal.io/pivotal/news-2/20-examples-of-getting-results-with-big-data
http://en.wikipedia.org/wiki/Apache_Hadoop
This article is written by B. Kiran Kumar. He is currently a first-year PGP student at IIM Raipur and has 3.9 years of experience at Virtusa.