June 15, 2015

Big Data Analytics

Big data is a collection of data sets so large and complex that they cannot be processed by conventional data processing systems. Analytics refers to extracting meaningful patterns from unstructured or semi-structured data. Big data analytics, then, is the processing of large volumes of data (unstructured, semi-structured and structured) to discover unknown correlations, hidden patterns, consumer preferences, market trends and other useful business information.

The results of such analysis can be used to improve operational efficiency, target potential customers more precisely, open new avenues for revenue generation, gain a competitive edge over rival firms and deliver better customer service. The main aim of big data analytics is to help firms make more informed business decisions by enabling analytics professionals and data scientists to analyze and interpret huge volumes of data that remain untapped by traditional business intelligence programs.

Big data encompasses internet clickstream information, server logs, social media content, social network activity reports, mobile phone call records, customer survey responses, market research data and more. Usually only unstructured and semi-structured data is associated with big data, but research firms such as Forrester and Gartner consider structured data a valid component of big data as well.

Technologies in Big Data Analytics

Some of the software tools used to analyze big data are based on predictive analytics. A newer class of technologies, such as Hadoop and related tools like MapReduce, YARN, Spark and Pig, along with NoSQL databases, can process big data at scale. These technologies form a crucial part of the open source software ecosystem that supports the processing of large and diverse data sets.

Hadoop
Hadoop is an open source, Java-based programming framework that supports the processing of huge data sets in a distributed computing environment. It is maintained by the Apache Software Foundation. It was inspired by Google's MapReduce, a software framework in which an application is divided into numerous small parts. These parts (also called fragments or blocks) can be run on any node (a connection point in a network) in the cluster, and the partial results are then combined.
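
To make the idea concrete, below is a minimal sketch of the classic word-count job written against Hadoop's standard Java MapReduce API (a generic textbook example rather than code from any of the solutions discussed here; the input and output HDFS paths are supplied as command-line arguments). Each map task processes one block of the input independently on some node of the cluster, and the reduce task combines the partial counts.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Each map task processes one block (input split) of the data,
  // independently of the other nodes in the cluster.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);   // emit (word, 1) for every word seen
      }
    }
  }

  // The reduce task combines the partial counts emitted by all mappers.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation per node
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory in HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory in HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}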

Illustration of Big Data Analytics

[Figure: Big Data Analytics solution by LogicMatter]

Let us now illustrate how a big data analytics solution is implemented using Hadoop. LogicMatter is a low-cost big data analytics solution provider. In its solution, traditional (e.g. ODS, EDW) and emerging (Hadoop MapReduce) analytical tools are combined to operate on big data. The data platform is built on the flexible Amazon Web Services (AWS) cloud. To capture, process, store and transform data, Hadoop is used together with the LogicMatter-designed Analytical Data Store (ADS). The platform uses Hadoop's file-based storage for flexible and fast data processing. This big data analytics platform enables the continuous delivery of both real-time and historical analytics via the popular Tableau visualization tool.
The analytics platform and its solutions are built specifically to solve complex customer problems such as clickstream analytics, video analytics, sales performance analysis, fraud detection and financial analytics.

Data Sources
The platform makes it possible to collect, process, store and transform both unstructured and structured data exclusively for analytical purposes. It can quickly process many varieties of unstructured data, including text, documents, weblogs, XML files, Excel files, audio and video, call logs, clickstream data and event data. It can also simultaneously process structured data from familiar enterprise data sources such as CRM, ERP and SQL databases.
The data collection process is separated from transformation and analysis. This makes it easy to add data sources, both known and unknown, without impacting the analysis, which is a major challenge with existing analytics solutions. Transformation of the data is delayed until you actually need to do the analysis, avoiding wasted effort and reducing upfront costs.
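
As a minimal sketch of this delayed-transformation (schema-on-read) idea, and not actual code from LogicMatter's platform, the example below assumes a hypothetical raw weblog file that is stored exactly as collected; structure is applied only at analysis time, when page hit counts are needed:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Raw weblog lines are collected and stored verbatim; a structure
// (a "schema") is applied only when a specific analysis needs it.
public class SchemaOnReadExample {
  public static void main(String[] args) throws IOException {
    Map<String, Long> hitsPerPage = new TreeMap<>();
    // "weblog.txt" is a hypothetical raw log file, one request per line,
    // e.g. "2015-06-15T10:02:11 /products/123 200"
    try (Stream<String> lines = Files.lines(Paths.get("weblog.txt"))) {
      lines.forEach(line -> {
        String[] fields = line.split("\\s+");   // parse at analysis time
        if (fields.length >= 3) {
          hitsPerPage.merge(fields[1], 1L, Long::sum); // count hits per page
        }
      });
    }
    hitsPerPage.forEach((page, hits) -> System.out.println(page + "\t" + hits));
  }
}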

Data Platform
The AWS data platform consists of two primary components:
- Hadoop cluster
- LogicMatter-designed Analytical Data Store (ADS)

The flexible and scalable Hadoop technology is used to collect both structured and unstructured data. The collected data is integrated, pre-processed and stored in the ADS. Hadoop's flat, file-based storage allows the platform to scale quickly and to handle large amounts of known and unknown data. Hadoop serves as an integrated, intermediate data store and acts as a feeder to the ADS. The data from Hadoop is mapped and transformed to develop a data model; this model, built iteratively and stored in the ADS, forms the basis for powerful analytics. The ADS uses traditional data warehouse technology, cubes and OLAP, so it supports all the familiar analytical techniques such as dashboards, reports and scorecards.
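
To give a flavour of the cube-style roll-up that such an ADS supports, here is a toy sketch (the fact table and figures are invented, and this is not LogicMatter's actual data model) that aggregates sales along two dimensions, region and month:

import java.util.Map;
import java.util.TreeMap;

// A toy "fact table" rolled up along two dimensions (region, month),
// the kind of aggregation an OLAP cube in the ADS would pre-compute
// to drive dashboards and reports. The figures here are invented.
public class SalesRollup {
  public static void main(String[] args) {
    Object[][] facts = {
      {"North", "2015-05", 120.0},
      {"North", "2015-06", 200.0},
      {"South", "2015-06", 150.0}
    };

    Map<String, Double> cube = new TreeMap<>();
    for (Object[] row : facts) {
      String cell = row[0] + " / " + row[1];          // dimension keys
      cube.merge(cell, (Double) row[2], Double::sum); // measure: total sales
    }
    cube.forEach((cell, total) -> System.out.println(cell + " -> " + total));
  }
}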

Visualization
One of the unique design features of LogicMatter's big data analytics services is continuous analytics, both real-time and historical. Because the data discovery platform is integrated, the visualization tool connects directly to either Hadoop or the ADS to develop the analytics. Ad-hoc queries can be run against Hadoop for exploratory analytics and instant data access, while standard, canned reports and dashboards connect to the ADS for a historical perspective.
The data platform is flexible enough that you can easily connect your favorite visualization tools, such as QlikView or Excel.
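
As a rough sketch of what the ad-hoc query path against Hadoop can look like, the example below assumes a HiveServer2 instance running on the cluster and uses Hive's JDBC interface; the host name, table and query are hypothetical and are not taken from LogicMatter's documentation:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Ad-hoc exploratory query against data stored in Hadoop, issued through
// HiveServer2's JDBC interface. Host, port, credentials, table name and
// query are placeholders for illustration only.
public class AdHocHiveQuery {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://hadoop-master:10000/default", "analyst", "");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "SELECT page, COUNT(*) AS hits FROM weblogs GROUP BY page")) {
      while (rs.next()) {
        System.out.println(rs.getString("page") + "\t" + rs.getLong("hits"));
      }
    }
  }
}

A visualization tool such as Tableau or QlikView would typically make the same kind of connection through its own Hive or ODBC connector rather than through hand-written JDBC code.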

Examples of Big Data Analytics Across Industries

There are several examples of how applications, analytics, sensors and networks are already creating results with big data across various industries.

1. The Financial Services Industry
The financial services industry uses big data to make better financial decisions. Banking giant Morgan Stanley ran into issues doing portfolio analysis on traditional databases and now uses Hadoop to analyze investments at a larger scale and with better results. Hadoop is also used in the industry for sentiment analysis, analysis of financial trades and predictive analytics.

2. The Automotive Industry 
Ford's modern Fusion hybrid generates up to 25 GB of data per hour. This data can be used to understand driving behaviour, avoid accidents, track wear and tear to identify issues early and lower maintenance costs, and even confirm travel arrangements.

3. Supply Chain And Logistics
Companies like Union Pacific Railroad use thermometers and ultrasound sensors to capture data about their engines and send it for analysis to identify equipment at risk of failure. INTTRA, the world's largest multi-carrier network for the ocean shipping industry, uses its OceanMetrics application to allow shippers and carriers to measure their own performance. Companies are also using telematics and big data to streamline trucking fleets. GE believes that these new, data-driven capabilities can contribute $15 trillion to global GDP by 2030.

4. Retail
Walmart uses big data from 10 different websites to feed shoppers' transaction data into its analytical systems. Sears and Kmart are trying to improve the personalization of marketing campaigns and offers with big data so they can compete better with Walmart and Target.

Practical Big Data Benefits

Develop Target Markets
By analyzing customers' purchase orders, companies can learn much more about who is buying their products and can target those customers accordingly.

Customize your website in real time
Through big data analytics, companies can personalize their websites and portals based on the gender, location and nationality of customers and offer them tailored recommendations. The best-known example is Amazon's use of item-based collaborative filtering (IBCF): features such as "Customers who bought this item also bought" and "Frequently bought together" help Amazon reach more customers and generate additional revenue.
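
To illustrate the core idea behind item-based collaborative filtering, here is a simplified sketch (this is not Amazon's actual algorithm, and the order data is invented) that counts how often pairs of items appear in the same order and recommends the items most frequently bought together:

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified item-based collaborative filtering: count item co-occurrence
// across orders, then recommend the items most often bought together with
// a given item. The orders below are invented sample data.
public class AlsoBought {
  public static void main(String[] args) {
    List<List<String>> orders = Arrays.asList(
        Arrays.asList("camera", "sd-card", "tripod"),
        Arrays.asList("camera", "sd-card"),
        Arrays.asList("camera", "camera-bag"),
        Arrays.asList("sd-card", "card-reader"));

    // cooccurrence.get(a).get(b) = number of orders containing both a and b
    Map<String, Map<String, Integer>> cooccurrence = new HashMap<>();
    for (List<String> order : orders) {
      for (String a : order) {
        for (String b : order) {
          if (!a.equals(b)) {
            cooccurrence.computeIfAbsent(a, k -> new HashMap<>())
                        .merge(b, 1, Integer::sum);
          }
        }
      }
    }

    // "Customers who bought 'camera' also bought ..." ranked by count
    cooccurrence.getOrDefault("camera", Collections.<String, Integer>emptyMap())
        .entrySet().stream()
        .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
        .forEach(e -> System.out.println(e.getKey() + " (" + e.getValue() + ")"));
  }
}

Real recommender systems normalize these co-occurrence counts into similarity scores and compute them offline over millions of orders, but the ranking idea is the same.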

Create new revenue streams
The insights that a company obtains from analyzing markets and consumers with big data are not valuable to that company alone. Firms can sell them as non-personalized trend data to large industry players operating in the same segment and create a whole new revenue stream.
Companies like Bloomberg and Analytics Quotient already sell analyzed information to other companies and generate revenue from it.

Reducing maintenance costs
Factories traditionally estimate that a certain type of equipment is likely to wear out after a given number of years and replace every piece of that equipment within that time, whether it needs it or not. Big data tools do away with such impractical and costly practices: by processing massive amounts of sensor data at high speed, they can spot failing devices and predict when individual units will actually wear out. This results in a much more cost-effective replacement strategy, as faulty devices are detected far sooner.
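
As a very rough illustration of this shift from schedule-based to data-driven replacement (a generic sketch, not tied to any particular vendor; the sensor readings and threshold are invented), the example below flags only those devices whose recent vibration readings exceed a safe limit:

import java.util.HashMap;
import java.util.Map;

// Flag devices for replacement based on observed sensor data instead of
// a fixed replacement schedule. The readings and threshold are invented.
public class PredictiveMaintenance {
  static final double VIBRATION_LIMIT = 7.0; // assumed safe limit (mm/s)

  public static void main(String[] args) {
    // deviceId -> recent vibration readings (mm/s)
    Map<String, double[]> readings = new HashMap<>();
    readings.put("pump-01", new double[]{3.1, 3.3, 3.0, 3.2});
    readings.put("pump-02", new double[]{6.8, 7.4, 8.1, 8.9}); // degrading
    readings.put("pump-03", new double[]{2.2, 2.1, 2.3, 2.2});

    readings.forEach((device, values) -> {
      double avg = 0;
      for (double v : values) avg += v;
      avg /= values.length;
      if (avg > VIBRATION_LIMIT) {
        System.out.println(device + ": average vibration " + avg
            + " mm/s exceeds limit, schedule replacement");
      }
    });
  }
}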

Offering enterprise-wide insights
Previously, when business users needed to analyze large amounts of varied data, they had to ask their IT colleagues for help because they lacked the technical expertise themselves, and by the time they received the requested information it was often no longer useful or even accurate. With big data tools, the technical teams can do the groundwork and build algorithms for faster searches, then develop systems and interactive, dynamic visualization tools that allow business users to explore, view and benefit from the crucial data directly.

Making Smart Cities
To deal with the consequences of their rapid expansion, a growing number of cities are leveraging big data tools for the benefit of their citizens. Oslo, Norway, for instance, reduced street lighting energy consumption by 62% with a smart lighting solution. The Memphis Police Department started using predictive software in 2006 and has been able to reduce serious crime by 30%. Portland, Oregon, used technology to optimize the timing of its traffic signals and was able to significantly reduce CO2 emissions in just six years.


References:

http://searchbusinessanalytics.techtarget.com/definition/big-data-analytics
http://dermatological/big-data-analytics-services-solutions/
http://blog.pivotal.io/pivotal/news-2/20-examples-of-getting-results-with-big-data
http://en.wikipedia.org/wiki/Apache_Hadoop


This article is written by B. Kiran Kumar. He is currently a first-year PGP student at IIM Raipur and has 3.9 years of experience at Virtusa.