Challenges of Apache Hadoop

Written by shiksha.dahiya on 07/05/2021 - 21:34

Hadoop is a complex distributed system with low-level APIs, and using it effectively requires specialized skills that most developers lack. Business logic and infrastructure APIs have no clear separation, which burdens application developers, and automated testing of end-to-end solutions is impractical or impossible. Hadoop is also a diverse collection of many open source projects, so teams must understand multiple technologies and hand-code the integration between them. As a result, significant effort is wasted on simple tasks like data ingestion and ETL, and moving from proof of concept to production is difficult and can take months or quarters.
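To give a feel for the programming model behind those low-level APIs, here is a minimal sketch of the map, shuffle, and reduce phases that Hadoop orchestrates, written in plain Python. All names are illustrative; a real Hadoop job is written in Java against the `org.apache.hadoop.mapreduce` API, and the framework, not your code, handles partitioning, sorting, and distribution across the cluster.

```python
from collections import defaultdict

def map_phase(lines):
    """Mapper: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reducer: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data is big", "data is everywhere"]
result = reduce_phase(shuffle(map_phase(lines)))
print(result)  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

Even this toy version hints at the complexity: the framework's job (the shuffle) and the developer's job (map and reduce) are tightly interleaved, which is part of why end-to-end Hadoop solutions are hard to test in isolation.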

Hadoop is more than just offline storage and batch analytics. Different processing paradigms require data to be stored in specific ways, and real-time and batch ingestion require deeply integrating several components. Common data patterns often require data consistency and correctness guarantees that Hadoop does not natively provide.

1.) Hadoop is a cutting-edge technology

Hadoop is a relatively new technology, and as with adopting any new technology, finding people who know it well is difficult!

2.) Hadoop in the Enterprise Ecosystem

Hadoop was designed to solve the Big Data problems encountered by Web and social companies, and in the process many features that Enterprises need or want were put on the back burner. For example, HDFS initially offered no native support for security and authentication.
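Security did arrive later, through Kerberos integration, but enabling it is itself a multi-step effort, which illustrates the point. As a sketch, these are the `core-site.xml` properties involved in switching a cluster from the default "simple" authentication mode to Kerberos (this assumes keytabs and a working Kerberos KDC are already in place, which is most of the real work):

```xml
<!-- core-site.xml: enable Kerberos authentication and authorization -->
<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
```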

3.) Hadoop is still rough around the edges

The development and administration tools for Hadoop are still fairly new. Companies like Cloudera, Hortonworks, MapR, and Karmasphere have been working on this issue; however, the tooling may not be as mature as what Enterprises are used to (from, say, Oracle's administration tools).

4.) Hadoop is NOT cheap

  1. Hardware cost: Hadoop runs on 'commodity' hardware, but these are not cheap machines; they are server-grade hardware. So standing up a reasonably large Hadoop cluster, say 100 nodes, will cost a significant amount of money.
  2. IT and operations costs: A large Hadoop cluster will require support from various teams: network admins, IT, security admins, and system admins. One also needs to think about operational costs like data center expenses: cooling, electricity, etc.
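A back-of-envelope calculation shows how these costs add up. Every figure below is an illustrative assumption, not a quote; real server prices, power draw, and electricity rates vary widely:

```python
# Rough cluster cost estimate -- all numbers are illustrative assumptions.
NODES = 100
SERVER_COST = 6_000          # assumed cost per server-grade node (USD)
POWER_KW_PER_NODE = 0.4      # assumed average power draw per node (kW)
POWER_COST_PER_KWH = 0.12    # assumed electricity rate (USD/kWh)
HOURS_PER_YEAR = 24 * 365

hardware = NODES * SERVER_COST
annual_power = NODES * POWER_KW_PER_NODE * HOURS_PER_YEAR * POWER_COST_PER_KWH

print(f"Up-front hardware: ${hardware:,.0f}")          # $600,000
print(f"Annual electricity alone: ${annual_power:,.0f}")  # $42,048
```

Even under these modest assumptions, hardware alone runs into the hundreds of thousands of dollars before counting cooling, networking, staff, or software support contracts.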

© Copyright By iVagus Services Pvt. Ltd. 2023. All Rights Reserved.
