Wednesday, 12 November 2014

10 Reasons Why Hadoop Is Not The Best Big Data Platform All The Time!



Tuesday, November 11, 2014. Hadoop has become the backbone of many applications, and Big Data can hardly be imagined without it. Hadoop offers distributed storage, scalability and high performance, and it is widely considered the standard platform for high-volume data infrastructures. But there are several reasons why Hadoop is not always the best solution for every purpose. Let's discuss ten disadvantages of Hadoop here:





1. Pig vs. Hive:

Hive UDFs cannot be used in Pig, and Pig UDFs cannot be used in Hive either. Accessing Hive tables from Pig requires HCatalog. So when extra functionality is needed on the Hive side, falling back to a Pig script is often not a practical option.
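To illustrate the HCatalog dependency: reading a Hive table from Pig goes through HCatalog's loader rather than Hive itself. A minimal sketch (the table name `default.page_views` and column `datestamp` are hypothetical examples):

```pig
-- Run with: pig -useHCatalog script.pig
-- Load a Hive table through HCatalog (table and column names are hypothetical)
views = LOAD 'default.page_views' USING org.apache.hive.hcatalog.pig.HCatLoader();
-- HCatalog exposes the Hive table's schema to Pig, so columns can be referenced by name
recent = FILTER views BY datestamp == '2014-11-12';
DUMP recent;
```

Note that without HCatalog on the classpath, Pig has no way to resolve the Hive metastore, and Hive UDF jars still cannot be called from the Pig script.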

2. Security concerns:

Managing a complex application with Hadoop is a huge challenge in itself. Hadoop's security model leaves much to be desired: it is disabled by default, and in complex deployments it often stays that way. Data is at serious risk because Hadoop lacks encryption at the storage and network levels, and unencrypted data can always be compromised easily.

3. Big Data cravings:

Hadoop is most attractive when a business is built on a genuinely big dataset. But before adopting it, you need answers to certain questions: how many terabytes of data do you actually have, do you have a steady, high-volume flow of incoming data, and how much of that data will really be operated upon?
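Those sizing questions can be turned into a back-of-envelope check. The sketch below is a hypothetical heuristic, not official guidance; the 1 TB cut-off is an illustrative assumption below which a single server or a conventional database usually wins:

```python
# Back-of-envelope check: is the dataset actually "Hadoop scale"?
# The 1 TB threshold is an illustrative assumption, not official guidance.

def hadoop_worth_it(current_tb, daily_growth_gb, retention_days):
    """Return True if the projected data volume plausibly justifies a cluster."""
    projected_tb = current_tb + (daily_growth_gb * retention_days) / 1024
    return projected_tb >= 1.0

# 200 GB today, growing 5 GB/day, 90-day retention -> ~0.64 TB projected
print(hadoop_worth_it(0.2, 5, 90))    # False: probably not worth a cluster
# 10 TB today, 100 GB/day, 365-day retention -> ~45 TB projected
print(hadoop_worth_it(10, 100, 365))  # True: Hadoop territory
```

If the answer comes out small, the operational overhead of a cluster likely outweighs its benefits.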

4. Shared libraries forcefully stored in HDFS:

Hadoop keeps repeating this issue. If a Pig script is stored in HDFS, it is assumed that its JAR files will live there too, and the same theme recurs in Oozie and other tools. Storing shared libraries in HDFS is not a bad idea in itself, but keeping those libraries consistent across a large organisation is a painful task.
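The pattern looks like this in practice: a Pig script registers its UDF jar straight out of HDFS, so every team must agree on (and maintain) that path. A hedged sketch, where the path, jar name and UDF class are hypothetical:

```pig
-- The jar must already exist in HDFS; keeping this path and version
-- in sync across teams is exactly the painful part described above.
REGISTER 'hdfs:///user/shared/lib/my-udfs-1.0.jar';
DEFINE CLEAN com.example.udf.CleanText();

raw     = LOAD '/data/input' AS (line:chararray);
cleaned = FOREACH raw GENERATE CLEAN(line);
```

Renaming or upgrading that jar silently breaks every script and Oozie workflow that points at the old path.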

5. Vulnerable by nature:

Hadoop carries security risk by its very nature. The framework is written in Java, a platform that has long been a popular target among cyber criminals. That alone leaves Hadoop deployments more exposed to data breaches.

6. Oozie:

Debugging is not a fun job. An error does not always mean you have done something wrong: it can be a protocol error arising from a configuration typo or a schema-validation failure, and these kinds of errors fail on the server. In such cases Oozie is not of much help, especially if it has not been set up properly.

7. Unsuitable for small data:

Big Data doesn't always mean big business, and Big Data platforms are not always suited to small-data needs. Hadoop is one such platform: it is simply not a good fit for small data. Its design is optimised for high capacity, and the Hadoop Distributed File System (HDFS) cannot efficiently support random reads of small files. Hence, Hadoop is not the best solution for organisations that deal with small amounts of data.
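The small-files problem is easy to quantify: the NameNode keeps an in-memory object for every file and every block, and a commonly cited rule of thumb puts each object at roughly 150 bytes. A sketch comparing many small files against the same volume of data in large files (the 150-byte figure is an approximation, not an exact constant):

```python
# Rough NameNode memory cost: ~150 bytes per file object and per block object
# (a widely cited rule of thumb, not an exact constant).
BYTES_PER_OBJECT = 150
BLOCK_SIZE_MB = 128  # common default HDFS block size

def namenode_bytes(num_files, file_size_mb):
    """Approximate NameNode heap consumed by num_files files of the given size."""
    blocks_per_file = max(1, -(-file_size_mb // BLOCK_SIZE_MB))  # ceiling division
    objects = num_files * (1 + blocks_per_file)  # one file object + its block objects
    return objects * BYTES_PER_OBJECT

# ~10 TB as 10 million 1 MB files vs. the same data as 80,000 files of 128 MB
small = namenode_bytes(10_000_000, 1)
large = namenode_bytes(80_000, 128)
print(small // large)  # the small-file layout costs ~125x more NameNode memory
```

The data volume is identical in both cases; only the file layout changes, yet the metadata cost differs by two orders of magnitude. That is why HDFS is a poor home for millions of tiny files.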

8. Stability issues:

Hadoop is an open-source platform developed by many contributors who are still actively working on the project, so, like any other open-source software, it sees a constant stream of changes and improvements. As a result, Hadoop has had stability issues to a large extent. Organisations are advised to run the latest stable version of Hadoop to avoid these kinds of problems.

9. Documentation:

The documentation of the Hadoop system is not very refined and contains several errors. The shared examples are not always tested, which leads readers into mistakes. The most formidable part is the documentation for Oozie, whose examples don't even pass schema validation.

10. Repository management:

If you have ever installed anything from the Hadoop repositories, you will know that they don't behave properly all the time, as they are poorly managed. Compatibility is also not consistently checked when a new application is installed.
