Are Master Data Management and Hadoop a Good Match?

Master Data is the critical electronic information about the company that we cannot afford to lose. Accordingly, we should sanitise it, look after it, and store it safely in several separate places that are independent of each other. The advent of Big Data introduced the current era of huge repositories “in the clouds”. They are not, of course, but at least they are remote. This short article discusses Hadoop and whether it is a good platform for backing up your Master Data.

About Hadoop

Hadoop is an open-source Apache software framework built on the assumption that hardware failure is so common that backups are unavoidable. It comprises a storage area and a management part that distributes the data to smaller nodes where it processes faster and more efficiently. Prominent users include Yahoo! and Facebook. In fact, more than half of the Fortune 50 companies were using Hadoop in 2013.

Hadoop, whose 1.0 release arrived in December 2011, has survived its baptism of fire and become a respected, reliable option. But is this something the average business owner can tackle on their own? Bear in mind that open source software generally comes with little implementation support from the vendor.

The Hadoop Strong Suit

  • Free to download, use and contribute to
  • Everything you need “in the box” to get started
  • Distributed across multiple fire-walled computers
  • Fast processing of data held in efficient cluster nodes
  • Massive, scalable storage you are unlikely to run out of

Practical Constraints

There is more to Hadoop than writing to WordPress. The most straightforward solutions are uploading using Java commands, obtaining an interface mechanism, or using third-party vendor connectors such as SAS/ACCESS. The system does not replace the need for IT support, although it is cheap and exceptionally powerful.

The Not-Free Safer Option

Smaller companies without in-depth in-house support would be wise to engage a technical intermediary. There are companies that provide commercial implementations followed by support. Microsoft, Amazon and Google, among others, all have commercial versions in their catalogues, and support teams at the end of the line.

Check our similar posts

The Better Way of Applying Benford’s Law for Fraud Detection

Applying Benford’s Law to large collections of data is an effective way of detecting fraud. In this article, we’ll introduce you to Benford’s Law, talk about how auditors are employing it in fraud detection, and introduce you to a more effective way of integrating it into an IT solution.

Benford’s Law in a nutshell

Benford’s Law states that certain data sets – including certain accounting numbers – exhibit a non-uniform distribution of first digits. Simply put, if you gather all the first digits (e.g. 8 is the first digit of £814 and 1 is the first digit of £1768) of all the numbers that make up one of these data sets, the smallest digits will appear more frequently than the larger ones.

That is, according to Benford’s Law,

1 should comprise roughly 30.1% of all first digits;
2 should be 17.6%;
3 should be 12.5%;
4 should be 9.7%, and so on.
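
The percentages listed here come from the standard Benford formula, P(d) = log10(1 + 1/d), where d is the first digit. As a minimal sketch (the function name `benford_expected` is our own, chosen for illustration), the full set of expected shares can be computed like this:

```python
import math


def benford_expected(digit: int) -> float:
    """Expected share of numbers whose first digit is `digit` (1 to 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("first digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


# Print the expected share for every possible first digit.
for d in range(1, 10):
    print(f"{d}: {benford_expected(d):.1%}")
```

The nine shares add up to 100%, and each one is smaller than the one before it.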

Notice that the 1s (ones) occur far more frequently than the rest. Those who are not familiar with Benford’s Law tend to assume that all digits should be distributed uniformly. So when fraudulent individuals tinker with accounting data, they may end up putting in more 9s or 8s than there actually should be.

Once an accounting data set is found to show a large deviation from this distribution, auditors move in to make a closer inspection.
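
To give a feel for what “deviation” might mean in practice, here is a rough sketch that compares the observed first digits of a data set with the Benford shares, using the mean absolute deviation as the score. The helper names are hypothetical, the choice of score is ours rather than something this article prescribes, and the cut-off for a “large” deviation is left to the auditor’s judgement.

```python
import math
from collections import Counter
from typing import Iterable, Optional


def first_digit(value) -> Optional[int]:
    """Return the leading non-zero digit of a number, or None if it has none."""
    text = str(abs(value)).lstrip("0.")
    return int(text[0]) if text and text[0].isdigit() else None


def benford_mad(values: Iterable) -> float:
    """Mean absolute deviation between observed and Benford first-digit shares."""
    digits = [d for d in map(first_digit, values) if d is not None]
    if not digits:
        raise ValueError("no usable values")
    counts = Counter(digits)
    total = len(digits)
    return sum(
        abs(counts[d] / total - math.log10(1 + 1 / d)) for d in range(1, 10)
    ) / 9


# Powers of two are a classic example of Benford-like data.
natural = [2 ** n for n in range(1, 1001)]
# Every first digit appears equally often here, which is not Benford-like.
uniform = list(range(100, 1000))

print(f"powers of two: {benford_mad(natural):.4f}")
print(f"uniform digits: {benford_mad(uniform):.4f}")
```

The second score comes out far higher than the first, which is the kind of gap that would prompt a closer look.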

Benford’s Law spreadsheets and templates

Because Benford’s Law has been proven to be effective in discovering unnaturally-behaving data sets (such as those manipulated by fraudsters), many auditors have created simple software solutions that apply this law. Most of these solutions, owing to the fact that a large majority of accounting departments use spreadsheets, come in the form of spreadsheet templates.

You can easily find free downloadable spreadsheet templates that apply Benford’s Law as well as simple How-To articles that can help you to implement the law on your own existing spreadsheets. Just Google “Benford’s law template” or “Benford’s law spreadsheet”.

I suggest you try out some of them yourself to get a feel for how they work.

The problem with Benford’s Law when used on spreadsheets

There’s actually another reason why I wanted you to try those spreadsheet templates and How-To’s yourself. I wanted you to see how susceptible these solutions are to trivial errors. Whenever you work on these spreadsheet templates – or your own spreadsheets for that matter – when implementing Benford’s Law, you can commit mistakes when copy-pasting values, specifying ranges, entering formulas, and so on.

Furthermore, some of the data might be located in different spreadsheets, which can likewise be found in different departments and have to be emailed for consolidation. The departments that own this data will have to extract the needed data from their own spreadsheets, transfer it to another spreadsheet, and send it to the person in charge of consolidation.

These activities can introduce errors as well. That’s why we think that, while Benford’s Law can be an effective tool for detecting fraud, spreadsheet-based working environments can taint the entire fraud detection process.

There’s actually a better IT solution where you can use Benford’s Law.

Why a server-based solution works better

In order to apply Benford’s Law more effectively, you need to use it in an environment that implements better controls than what spreadsheets can offer. What we propose is a server-based system.

In a server-based system, your data is placed in a secure database. People who want to input data or access existing data will have to go through access controls such as login procedures. These systems also have features that log access history so that you can trace who accessed what and when.

If Benford’s Law is integrated into such a system, there would be no need for any error-prone copy-pasting activities because all the data is stored in one place. Thus, fraud detection initiatives can be much faster and more reliable.
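
As a rough illustration of running the check where the data lives, the sketch below counts first digits with a single query. It uses Python’s built-in `sqlite3` module purely as a stand-in for a server database, and the `ledger` table and `amount` column are hypothetical names. A real deployment would sit behind the login and access-logging controls described above, which this sketch does not attempt to show.

```python
import sqlite3


def first_digit_counts(conn: sqlite3.Connection) -> dict:
    """Count ledger amounts by leading non-zero digit, inside the database."""
    query = """
        SELECT substr(ltrim(CAST(abs(amount) AS TEXT), '0.'), 1, 1) AS digit,
               COUNT(*) AS n
        FROM ledger
        WHERE amount <> 0
        GROUP BY digit
        ORDER BY digit
    """
    return {int(digit): n for digit, n in conn.execute(query)}


# An in-memory database with a handful of made-up amounts, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
conn.executemany(
    "INSERT INTO ledger (amount) VALUES (?)",
    [(814,), (1768,), (120.5,), (0.19,), (93,)],
)

print(first_digit_counts(conn))
```

Because the counting happens in one query against one table, nobody has to copy values between files before the test can run.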

You can get more information on this site regarding the disadvantages of spreadsheets. We can also tell you more about the advantages of server application solutions.

Which Services to Share?

It often makes sense to pool resources. Farmers have been doing so for decades by collectively owning expensive combine harvesters. France, Germany, the United Kingdom and Spain have successfully pooled their manufacturing power to take on Boeing with their Airbus. But does this mean that shared services are right in every situation?

The Main Reasons for Sharing

The primary argument is economies of scale. If the Airbus partners each made 25% of the engines, their production runs would be shorter and they would collectively need more technicians and tools. The second line of reasoning is that shared processes are more efficient, because there are greater opportunities for standardisation.

Is This the Same as Outsourcing?

Definitely not! If France, Germany, the United Kingdom and Spain had decided to form a collective airline and asked Boeing to build their fleet of aircraft, then they would have outsourced airplane manufacture and lost a strategic industry. This is where the bigger picture comes into play.

The Downside of Sharing

Centralising activities can cause havoc with workflow, and implode decentralised structures that have evolved over time. The Airbus technology called for creative ways to move aircraft fuselages around. In the case of farmers, they had to learn to be patient and accept that they would not always harvest at the optimum time.

Things Best Not Shared

Core business is what brings in the money, and this should be tailor-made to its market. It is also what keeps the company afloat and therefore best kept on board. The core business of the French, German, United Kingdom and Spanish civilian aircraft industry is transporting passengers. This is why they are able to share an aircraft supply chain that spun off into a commercial success story.

Things Best Shared

It follows that activities that are neither core nor place bound – and can therefore happen anywhere – are the best targets for sharing. Anything processed on a computer can be processed on a remote computer. This is why automated accounting, stock control and human resources are the perfect services to share.

So Case Closed Then?

No, not quite. Technology has yet to overtake our humanity, our desire to feel part of the process and our need to feel valued. When an employee, supplier or customer has a problem with our administration, it’s just not good enough to abdicate and say “Oh, you have to speak to Dublin, they do it there”.

Call centres are a good example of abdication from stakeholder care. To an extent, these have “confiscated” the right of customers to speak directly to their providers. This has cost businesses more customers than they may wish to measure. Sharing services is not about relinquishing the duty to remain in touch. It is simply a more efficient way of managing routine matters.

2015 ESOS Guidelines Chapter 2 – Deadlines and Status Changes

The ESOS process is deadline driven, and meeting key dates is non-negotiable. The penalties for not complying and for providing false or misleading information are £50,000 each. Simply not maintaining adequate records could cost you £5,000. The carrot on the end of the stick is the financial benefit you stand to gain.

Qualifying for inclusion under the ESOS umbrella depends on the status of your company in terms of employee numbers, turnover and balance sheet on 31 December 2014. Regardless of whether you meet the 2014 threshold or not, you must reconsider your situation on 31 December 2018, 2022 and 2026.

Compliance Period   Qualification Date   Period Covered                        Compliance Date
1                   31 December 2014     17 July 2014* to 5 December 2015      5 December 2015
2                   31 December 2018     6 December 2015 to 5 December 2019    5 December 2019
3                   31 December 2022     6 December 2019 to 5 December 2023    5 December 2023
4                   31 December 2026     6 December 2023 to 5 December 2027    5 December 2027

Notes:

1. The first compliance period begins on the date the regulations became effective

2. Energy audits from 6 December 2011 onward may go towards the first compliance report
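
If you want to track these dates programmatically, a small lookup is enough. The sketch below simply copies the qualification and compliance dates from the table above. The names `ESOS_PERIODS` and `next_compliance_deadline` are our own, and the code is an illustration rather than an official tool.

```python
from datetime import date
from typing import Optional, Tuple

# (period number, qualification date, compliance date), copied from the table above.
ESOS_PERIODS = [
    (1, date(2014, 12, 31), date(2015, 12, 5)),
    (2, date(2018, 12, 31), date(2019, 12, 5)),
    (3, date(2022, 12, 31), date(2023, 12, 5)),
    (4, date(2026, 12, 31), date(2027, 12, 5)),
]


def next_compliance_deadline(today: date) -> Optional[Tuple[int, date]]:
    """Return (period, compliance date) for the first deadline on or after `today`."""
    for period, _qualification, deadline in ESOS_PERIODS:
        if today <= deadline:
            return period, deadline
    return None  # past the last period listed in the table


period_and_deadline = next_compliance_deadline(date(2016, 1, 1))
print(period_and_deadline)
```

A date that falls exactly on a compliance date is treated as still belonging to that period.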

Changes in Organisation Status

If your organisation status changes after a qualification date when you met compliance thresholds, you are still bound to complete your ESOS assessment for that compliance period. This is regardless of any change in size or structure. Your qualification status then remains in force until the next qualification date when you must reconsider it.
