Directions Hadoop is Moving In

Hadoop is a data system so big it is like a virtual jumbo where your PC is a flea. One of the developers named it after his kid?s toy elephant so there is no complicated acronym to stumble over. The system is actually conceptually simple. It has loads of storage capacity and an unusual way of processing data. It does not wait for big files to report in to its software. Instead, it takes the processing system to the data.

The next question is what to do with Hadoop. Perhaps the question would be better expressed as, what can we do with a wonderful opportunity that we could not do before. Certainly, Hadoop is not for storing videos when your laptop starts complaining. The interfaces are clumsy and Hadoop belongs in the realm of large organisations that have the money. Here are two examples to illustrate the point.

Hadoop in Healthcare

In the U.S., healthcare generates more than 150 gigabytes of data annually. Within this data there are important clues that online training provider DeZyre believes could lead to these solutions:

  • Personalised cancer treatments that relate to how individual genomes cause the disease to mutate uniquely
  • Intelligent online analysis of life signs (blood pressure, heart beat, breathing) in remote children?s hospitals treating multiple victims of catastrophes
  • Mining of patient information from health records, financial status and payroll data to understand how these variables impact on patient health
  • Understanding trends in healthcare claims to empower hospitals and health insurers to increase their competitive advantages.
  • New ways to prevent health insurance fraud by correlating it with claims histories, attorney costs and call centre notes.

Hadoop in Retail

The retail industry also generates a vast amount of data, due to consumer volumes and multiple touch points in the delivery funnel. Skillspeed business trainers report the following emerging trends:

  • Tracing individual consumers along the marketing trail to determine individual patterns for different demographics and understand consumers better.
  • Obtaining access to aggregated consumer feedback regarding advertising campaigns, product launches, competitor tactics and so on.
  • Staying with individual consumers as they move through retail outlets and personalising their experience by delivering contextual messages.
  • Understanding the routes that virtual shoppers follow, and adding handy popups with useful hints and tips to encourage them on.
  • Detecting trends in consumer preferences in order to forecast next season sales and stock up or down accordingly.

Where to From Here?

Big data mining is akin to deep space research in that we are exploring fresh frontiers and discovering new worlds of information. The future is as broad as our imagination.?

Contact Us

  • (+353)(0)1-443-3807 – IRL
  • (+44)(0)20-7193-9751 – UK

Check our similar posts

The Better Way of Applying Benford’s Law for Fraud Detection

Applying Benford’s Law on large collections of data is an effective way of detecting fraud. In this article, we?ll introduce you to Benford’s Law, talk about how auditors are employing it in fraud detection, and introduce you to a more effective way of integrating it into an IT solution.

Benford’s Law in a nutshell

Benford’s Law states that certain data sets – including certain accounting numbers – exhibit a non-uniform distribution of first digits. Simply put, if you gather all the first digits (e.g. 8 is the first digit of ?814 and 1 is the first digit of ?1768) of all the numbers that make up one of these data sets, the smallest digits will appear more frequently than the larger ones.

That is, according to Benford’s Law,

1 should comprise roughly 30.1% of all first digits;
2 should be 17.6%;
3 should be 12.5%;
4 should be 9.7%, and so on.

Notice that the 1s (ones) occur far more frequently than the rest. Those who are not familiar with Benford’s Law tend to assume that all digits should be distributed uniformly. So when fraudulent individuals tinker with accounting data, they may end up putting in more 9s or 8s than there actually should be.

Once an accounting data set is found to show a large deviation from this distribution, then auditors move in to make a closer inspection.

Benford’s Law spreadsheets and templates

Because Benford’s Law has been proven to be effective in discovering unnaturally-behaving data sets (such as those manipulated by fraudsters), many auditors have created simple software solutions that apply this law. Most of these solutions, owing to the fact that a large majority of accounting departments use spreadsheets, come in the form of spreadsheet templates.

You can easily find free downloadable spreadsheet templates that apply Benford’s Law as well as simple How-To articles that can help you to implement the law on your own existing spreadsheets. Just Google “Benford’s law template” or “Benford’s law spreadsheet”.

I suggest you try out some of them yourself to get a feel on how they work.

The problem with Benford’s Law when used on spreadsheets

There’s actually another reason why I wanted you to try those spreadsheet templates and How-To’s yourself. I wanted you to see how susceptible these solutions are to trivial errors. Whenever you work on these spreadsheet templates – or your own spreadsheets for that matter – when implementing Benford’s Law, you can commit mistakes when copy-pasting values, specifying ranges, entering formulas, and so on.

Furthermore, some of the data might be located in different spreadsheets, which can likewise by found in different departments and have to be emailed for consolidation. The departments who own this data will have to extract the needed data from their own spreadsheets, transfer them to another spreadsheet, and send them to the person in-charge of consolidation.

These activities can introduce errors as well. That’s why we think that, while Benford’s Law can be an effective tool for detecting fraud, spreadsheet-based working environments can taint the entire fraud detection process.

There?s actually a better IT solution where you can use Benford’s Law.

Why a server-based solution works better

In order to apply Benford’s Law more effectively, you need to use it in an environment that implements better controls than what spreadsheets can offer. What we propose is a server-based system.

In a server-based system, your data is placed in a secure database. People who want to input data or access existing data will have to go through access controls such as login procedures. These systems also have features that log access history so that you can trace who accessed which and when.

If Benford’s Law is integrated into such a system, there would be no need for any error-prone copy-pasting activities because all the data is stored in one place. Thus, fraud detection initiatives can be much faster and more reliable.

You can get more information on this site regarding the disadvantages of spreadsheets. We can also tell you more about the advantages of server application solutions.

How DevOps oils the Value Chain

DevOps ? a clipped compound of development and operations – is a way of working whereby software developers are in a team with project beneficiaries. A client centred approach extends the project plan to include the life cycle of the product or service, for which the software is developed.

We can then no longer speak of a software project for say Joe?s Accounting App. The software has no intrinsic value of its own. It follows that the software engineers are building an accounting app product. This is a small, crucially important distinction, because they are no longer in a silo with different business interests.

To take the analogy further, the developers are no longer contractors possibly trying to stretch out the process. They are members of Joe?s accounting company, and they are just as keen to get to market fast as Joe is to start earning income. DevOps uses this synergy to achieve the overarching business goal.

A Brief Introduction to OpsDev

You can skip this section if you already read this article. If not then you need to know that DevOps is a culture, not a working method. The three ?members? are the software developers, the beneficiaries, and a quality control mechanism. The developers break their task into smaller chunks instead of releasing the code to quality control as a single batch. As a result, the review process happens contiguously along these simplified lines.

Code QC Test ? ? ?
? Code QC Test ? ?
? ? Code QC Test ?
? ? ? Code QC Test
Colour Key Developers Quality Control Beneficiary

This is a marked improvement over the previously cumbersome method below.

Write the Code ? Test the Code ? Use the Code
? Evaluate, Schedule for Next Review ?

Working quickly and releasing smaller amounts of code means the OpsDev team learns quickly from mistakes, and should come to product release ahead of any competitor using the older, more linear method. The shared method of working releases huge resources in terms of user experience and in-line QC practices. Instead of being in a silo working on its own, development finds it has a richer brief and more support from being ?on the same side of the organisation?.

The Key Role that Application Program Interfaces Play

Application Program Interfaces, or API?s for short, are building blocks for software applications. Using proprietary software-bridges speeds this process up. A good example would be the PayPal applications that we find on so many websites today. API?s are not just for commercial sites, and they can reduce costs and improve efficiency considerably.

The following diagram courtesy of TIBCO illustrates how second-party applications integrate with PayPal architecture via an API fa?ade.

Working quickly and releasing smaller amounts of code means the OpsDev team learns quickly from mistakes, and should come to product release ahead of any competitor using the older, more linear method. The shared method of working releases huge resources in terms of user experience and in-line QC practices. Instead of being in a silo working on its own, development finds it has a richer brief and more support from being ?on the same side of the organisation?.

imgd2.jpg

The DevOps Revolution Continues ?

We close with some important insights from an interview with Jim Stoneham. He was general manager of the Yahoo Communities business unit, at the time Flickr became a part. ?Flickr was a codebase,? Jim recalls, ?that evolved to operate at high scale over 7 years – and continuing to scale while adding and refining features was no small challenge. During this transition, it was a huge advantage that there was such an integrated dev and ops team?

The ?maturity model? as engineers refer to DevOps status currently, enables developers to learn faster, and deploy upgrades ahead of their competitors. This means the client reaches and exceeds break-even sooner. DevOps lubricates the value chain so companies add value to a product faster. One reason it worked so well with Flickr, was the immense trust between Dev and Ops, and that is a lesson we should learn.

?We transformed from a team of employees to a team of owners. When you move at that speed, and are looking at the numbers and the results daily, your investment level radically changes. This just can’t happen in teams that release quarterly, and it’s difficult even with monthly cycles.? (Jim Stoneham)

Contact Us

  • (+353)(0)1-443-3807 – IRL
  • (+44)(0)20-7193-9751 – UK
Understanding Carbon Emissions

Carbon emission is one of the hottest issues in the world of energy and environment today. While it is supposedly an essential component of the ecosystem, it has already become a large contributing factor to climate change. Carbon emission might be good but abuse of this natural process has made it harmful to people across the globe.

This series of articles aims to help people understand the intricacies of carbon emission and what society can do to efficiently manage this natural occurrence.

Natural Carbon Cycle

Two important elements in the carbon cycle are carbon, which is present in every living thing all over the world; and oxygen, which is found in the air that people breathe. When these two bond together, they create a colourless and odourless greenhouse gas known as carbon dioxide, which is then crucial to trapping infrared radiation heat in the atmosphere and also for weathering rocks.

Carbon is not only found in the atmosphere of the earth. It is also an element found in oceans, plants, coal deposits, oil and natural gas from deep down the earth?s core. Through the carbon cycle, carbon moves naturally from one portion of the earth to another. Looking at this scenario, one can see that the natural carbon cycle is a healthy way to release carbon dioxide into the air in order to be absorbed again by trees and plants.

Altered Carbon Cycle

The natural circulation of carbon among the atmosphere is vital to humankind. However, studies show that humans misuse this natural cycle and abuse it instead. Whenever people burn fossil fuels such as coal, oil and natural gas, they produce carbon dioxide ? which is an excess addition to the natural flow of carbon in the environment. The problem is that the release of carbon dioxide is much more than what plants and trees can re-absorb. People are not only adding CO2 to the atmosphere, they are also influencing the ability of natural sinks, such as forests, to remove it from the atmosphere. Humans alter the carbon cycle by contributing doubled or tripled greenhouse gas to the atmosphere, faster than nature can ever eliminate. Worst, nature?s balance is destroyed.

The Result

Greenhouse gases include carbon dioxide, methane, nitrous oxide, fluorinated gas and other gases. Although these gasses contribute to climate change, carbon dioxide is the largest greenhouse gas that humans emit. The reason why people talk about carbon emissions most, is because we produce more carbon dioxide than any other greenhouse gas.

The increasing amount of carbon emissions cause global warming to become more evident. All the extra carbon dioxide causes the earth?s overall temperature to rise as well. As the temperature increases, climate also changes unpredictably. Flood, droughts, heat waves and hurricanes are now widely experienced even in places where these phenomenon never used to happen.

To be able to reduce the risk of more severe weather conditions means burning less fossil fuels and shifting more to renewable sources. This is never easy. But, definitely, it’s worth a try.

Ready to work with Denizon?