Harpreet is an experienced IT consultant with strong strategic, analytical, architectural and leadership skills. He has broad experience in IT management and architecture and has led teams in various projects. He specializes in IT architecture, program governance, IT roadmaps and strategy.
He is a positive, results-driven and innovative individual with proven success in balancing operational synergies and business growth with client satisfaction, offering over 13 years’ experience in management and architecture positions in world-class organisations within the IT industry.
He is presently working as Program Architect for the Department of Attorney General & Justice, where he has been involved in the architecture roadmap for the overall design and is working to establish synergy between the various programs to be hosted in a cloud environment.
Harpreet has a passion for IT strategy and architecture, adventure sports and travelling.
He can be contacted on firstname.lastname@example.org
In recent conversations among business and IT people, the latest buzzword I have come across is “Big Data”.
Big data is a buzzword, or catch-phrase, used to describe a volume of data so large that it is difficult to process using traditional database and software techniques. In most enterprise scenarios the data is too big, it moves too fast, or it exceeds current processing capacity. The term big data, especially when used by vendors, may also refer to the technology (including the tools and processes) that an organization requires to handle large amounts of data and the associated storage facilities.
The amount of data in our world has been exploding, and analyzing large data sets will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus. The increasing volume and detail of information captured by enterprises, the rise of multimedia, social media, and the Internet will fuel exponential growth in data for the foreseeable future.
So, what is Big Data?
Big data is a relative term describing a situation where the volume, velocity, variety and complexity of data exceed an organization’s storage or compute capacity for accurate and timely decision making. Big data is defined less by volume – which is a constantly moving target – than by its ever-increasing variety, velocity, variability and complexity.
Many factors contribute to the increase in data volume: transaction-based data stored through the years, unstructured data streaming in from social media, and increasing amounts of sensor and machine-to-machine data being collected. In the past, excessive data volume was a storage issue. But with decreasing storage costs, other issues emerge, including how to determine relevance within large data volumes and how to use analytics to create value from relevant data.
Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time. Reacting quickly enough to deal with data velocity is a challenge for most organizations.
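In practice, dealing with that kind of velocity often comes down to maintaining aggregates over a sliding time window instead of storing everything before analysing it. The sketch below is illustrative only; the sensor feed and the 60-second window are hypothetical assumptions, not a specific product’s approach:

```python
from collections import deque

def rolling_average(readings, window_seconds=60):
    """Yield (timestamp, running average) over a sliding time window.

    `readings` is an iterable of (timestamp, value) pairs assumed to
    arrive in timestamp order, as from a sensor or smart-meter feed.
    """
    window = deque()   # (timestamp, value) pairs currently in the window
    total = 0.0
    for ts, value in readings:
        window.append((ts, value))
        total += value
        # Evict readings that have aged out of the window
        while ts - window[0][0] > window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        yield ts, total / len(window)

# Hypothetical smart-meter feed: one reading per second for five minutes
stream = [(t, 100.0 + (t % 5)) for t in range(300)]
latest = None
for ts, avg in rolling_average(stream, window_seconds=60):
    latest = avg   # in practice: alert when the average crosses a threshold
```

Because the window only ever holds the most recent readings, memory use stays constant no matter how long the stream runs, which is the property that makes near-real-time processing of torrents of data feasible.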
Data today comes in all types of formats: structured, numeric data in traditional databases; information created from line-of-business applications; and unstructured text documents, email, video, audio, stock ticker data and financial transactions. In addition to the speed at which data comes your way, the data flows can be highly variable, with daily, seasonal and event-triggered peak loads that can be challenging to manage.
The economic value of different data varies significantly. Typically there is good information hidden amongst a larger body of non-traditional data; the challenge is identifying what is valuable and then transforming and extracting that data for analysis.
Difficulties dealing with data increase with the expanding universe of data sources and are compounded by the need to link, match and transform data across business entities and systems. Organizations need to understand relationships, such as complex hierarchies and data linkages, among all data.
A data environment can become extreme along any of the above dimensions or with a combination of two or all of them at once. However, it is important to understand that not all of your data will be relevant or useful. Organizations must be able to separate the wheat from the chaff and focus on the information that counts – not on the information overload. Big Data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis.
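To make the “wheat from the chaff” idea concrete, the sketch below filters a stream of heterogeneous records down to just those relevant to one question before doing any analysis. The mixed feed and its record formats are made-up assumptions for illustration:

```python
import json

# Hypothetical mixed feed: each line is a record from a different source.
raw_feed = [
    '{"source": "pos", "store": "A12", "amount": 59.90}',
    '{"source": "twitter", "text": "love the new store layout!"}',
    '{"source": "pos", "store": "A12", "amount": 12.50}',
    'corrupt line that will not parse',
    '{"source": "sensor", "temp_c": 21.4}',
]

def relevant_sales(lines):
    """Keep only the records that matter for one question: point-of-sale
    transactions. Everything else is 'chaff' for this particular analysis."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed input rather than failing the whole run
        if record.get("source") == "pos":
            yield record

total = sum(r["amount"] for r in relevant_sales(raw_feed))
print(f"Relevant sales volume: {total:.2f}")  # prints: Relevant sales volume: 72.40
```

The point is that the filtering question (“which records are wheat?”) is defined by the analysis you want to run, not by the data itself; a different question would keep a different subset of the same feed.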
What are the sources of Big Data?
The sources of data growth that are driving Big Data technology investments vary widely. Some represent entirely new data sources, while others are a change in the “resolution” of existing data generated. New data sources for Big Data include industries that only recently began to digitize their content. In virtually all of these cases, data growth rates in the past five years have been near infinite, since in most cases they started from zero. These industries include:
Media and Entertainment industry: The media/entertainment industry moved to digital recording, production, and delivery in the past five years and is now collecting large amounts of rich content and user viewing behaviours.
Healthcare: The healthcare industry is quickly moving to electronic medical records and images, which it wants to use for short-term public health monitoring and long-term epidemiological research programs.
Life sciences: Low-cost gene sequencing (<$1,000) can generate tens of terabytes of information that must be analysed to look for genetic variations and potential treatment effectiveness.
Video surveillance: Video surveillance is still transitioning from CCTV to IPTV cameras and recording systems that organizations want to analyse for behavioural patterns (security and service enhancement).
Transportation, Logistics, Retail and Telecommunications: Sensor data is being generated at an accelerating rate from fleet GPS transceivers, RFID tag readers, smart meters, and cell phones (including mobile browsing data); that data is used to optimize operations and drive operational business intelligence (BI) to realize immediate business opportunities.
Consumers are increasingly active participants in a self-service marketplace that not only records the use of affinity cards but can increasingly be combined with social networks and location-based metadata, creating a gold mine of actionable consumer data for retailers, distributors, and manufacturers of consumer packaged goods. Social media platforms such as Facebook and Twitter are the newest data sources. A number of new businesses are now building Big Data environments that leverage consumers’ nearly continuous streams of data about themselves (e.g., likes, locations, and opinions). Data growth patterns like these exceed the capabilities of Moore’s law and drive the need for hyperscale computing infrastructures that balance performance with density, energy efficiency, and cost.
Why should big data matter to you?
The real issue is not that you are acquiring large amounts of data; it’s what you do with the data that counts. The hopeful vision is that organizations will be able to take data from any source, harness the relevant data and analyse it to find answers to questions such as the following:
The image below shows how the IT landscape can be designed to maximize the value of big data.
For instance, by combining big data and high-powered analytics, it is possible to:
Big data is not that easy to establish
Although envisioning how big data projects can benefit your business is relatively easy, making them happen is anything but. The challenges become evident as soon as you start the process and only get more complex over time. Consider what you need to factor into your plans:
Sizing and Budget
Building out a big data infrastructure with the compute, storage, security and network bandwidth to handle large, growing volumes of data can be costly. How much infrastructure do you need? Should you prepare for a proof of concept or a full-scale implementation? What kind of growth do you anticipate? Do you have enough data centre capacity and, if not, do you have the budget to expand?
Three key variables will affect your infrastructure sizing requirements: use cases, data sources and data retention.
If any of these grow significantly in scope, you may find yourself scrambling to expand your system.
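Those three variables can be turned into a rough capacity estimate. The sketch below is a back-of-envelope calculation only; the source volumes, replication factor and growth headroom are illustrative assumptions, not recommendations:

```python
def raw_storage_tb(sources_gb_per_day, retention_days, replication=3, growth=1.0):
    """Back-of-envelope storage sizing.

    sources_gb_per_day: dict mapping each data source to its ingest volume in GB/day
    retention_days: how long raw data is kept
    replication: copies kept by the storage layer (e.g. HDFS defaults to 3)
    growth: multiplier for anticipated growth headroom
    """
    daily_gb = sum(sources_gb_per_day.values())
    return daily_gb * retention_days * replication * growth / 1024  # GB -> TB

# Illustrative numbers only: three hypothetical data sources
sources = {"clickstream": 50, "sensor_feeds": 120, "transactions": 10}
print(round(raw_storage_tb(sources, retention_days=365, growth=1.5), 1))  # → 288.7
```

Even this crude arithmetic makes the point in the text: add one more data source, extend retention, or raise the growth multiplier, and the storage requirement moves sharply, which is why under-scoped use cases lead to scrambling later.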
Your users must be able to access and analyse data from distant and disparate sources as though it were local. Having sufficient compute power and an infrastructure sized to your application are critical for rapid analysis. In addition, the closer the data centre is to users, the better the performance. Is your infrastructure adequate and your data centre close enough? Can your users reliably access what they need when they need it?
In this economy, every day you’re not applying big data practices is a day you’re losing ground to competitors who are. Your IT department may be able to stand up a test environment in a few months, but you’ll likely have to go in the queue for production development. And each time you need to change, you may have to go back in the queue and wait out the procurement cycle. How much time are you willing to let pass?
In-house skills versus external consultant skills
Your existing IT team may be able to handle a big data proof of concept project. However, production-grade big data management and analysis require a different set of skills altogether. Finding big data professionals to handle a production initiative is far from easy. According to McKinsey & Co., demand for those professionals in the U.S. alone will exceed the available supply by 140,000 to 190,000 positions by 2018.
Getting ready for Big Data
A lot of organizations agree that they don’t have the ability to meet their analytics needs, while few organizations plan additional hiring to do so, according to the American Management Association. The majority of organizations plan to invest in training to close their capability gaps. Human Resources and Sales are seen as lagging in analytical skills compared with other organizational functions. A lack of resources and corporate culture are found to be the biggest impediments to an organization’s ability to leverage big data. Below are a few quick tips that organisations need to focus on in order to gear up for the big data wave.
Identify analytical needs for your organisation: Assess your workforce for analytical capabilities and use that data to determine where to focus first. Any departments that fall well below the acceptable level should be addressed first, but if all else is equal, work on increasing the analytical abilities of top leaders, either through executive development or recruitment.
Invest and build analytical strengths in your organisation: To build analytical acumen, training should focus on using data to make better decisions rather than on specific tools and data-crunching techniques. This type of training will help employees approach problems from a more empirical point of view. Some functions within your organization already may have the needed skills and can be tapped as subject matter experts to help educate others.
Prepare to manage the flow of big data: The hubbub regarding big data is mostly about that first word: big. If organizations plan to make use of the enormous data sets available to them, the infrastructure must be in place beforehand. Expensive enterprise-wide applications may or may not be able to leverage the massive amounts of data collected, so it is important to understand what you are hoping to find before plunging into the overwhelming current of big data.
Embrace the analytical decision-making mindset: Changing from an instinctual, experience-based decision-making organization to a data-driven one isn’t as simple as increasing your organization’s analytical abilities. The very way in which problems are viewed has to change, which is why it is so important to have leaders who understand and use data-based/evidence-based decision-making. Merely having more data accomplishes nothing if that data isn’t used to make better, more fact-based decisions.
Big Data creates infrastructure challenges that extend from data creation to data collection to data storage and analysis. Big Data represents both big opportunities and big challenges for CIOs. Almost every CIO dreams about making IT a more valued asset to the organization. Big Data projects are at the frontier of the business, where most of the significant business expansion or cost reduction opportunities lie. Taking a lead in Big Data efforts provides the CIO with a chance to be a strategic partner with the business unit. Because speed is strategically important in most Big Data efforts, it will be tempting for business unit teams to move forward without IT support. Your IT team needs to recognize that it must think differently (as well as quickly) and fight for a seat at the table as Big Data strategies are developed. CIOs need to both understand what their organization is planning around Big Data and begin developing a Big Data IT infrastructure strategy. In doing so, CIOs need to understand the potential value Big Data represents to their organization and industry.