The Big Data Mindset

 

Big data, and Hadoop in particular, is here today, mostly because it is an open platform with powerful tools for building enterprise data platforms cost-effectively. There is no doubt, though, that hype drives a rush to Hadoop implementations with too little consideration for user adoption, sustainability, and internal staff augmentation. A few worthwhile considerations may be:

  • Can Hadoop be a total, end-to-end replacement for traditional relational databases when the data is highly structured?
  • Where data is highly structured, can data prep happen on a traditional RDBMS, with the result then denormalized and pushed into Hadoop for heavy-duty analytics and consumption?
  • How can the platform be calibrated creatively to retain the benefits of an RDBMS where data is highly structured and conforms to a traditional dimensional or hierarchical model?
  • Where data volume is not significant, how can the performance cost of scanning the nodes for a scant few blocks be offset?
  • While Hadoop may be the favored, partisan choice for data processing and storage, can conventional front-end tools such as Cognos, Power BI, and Qlik work seamlessly with it on their own? It is like driving a Lamborghini Veneno Roadster on tires from Costco!

As enterprises embrace big data platforms for analytics, that embrace should extend beyond the data platform itself to the middle and presentation tiers, corporate data consumption, and thought patterns.

Consideration | Traditional RDBMS         | MapReduce
--------------+---------------------------+----------------------------
Data size     | Gigabytes                 | Petabytes
Access        | Interactive and batch     | Batch
Updates       | Read and write many times | Write once, read many times
Transactions  | ACID                      | None
Structure     | Schema-on-write           | Schema-on-read
Integrity     | High                      | Low
Scaling       | Nonlinear                 | Linear
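
To make the Structure and Integrity entries of the comparison above a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: SQLite stands in for a traditional RDBMS, a plain list of JSON lines stands in for raw files in Hadoop storage, and the `orders` table, `try_insert`, `raw_records`, and `read_orders` are hypothetical names invented for this example.

```python
import json
import sqlite3

# Schema-on-write: the table definition is enforced when rows are inserted.
# SQLite (in memory) is an assumed stand-in for a traditional RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " amount REAL NOT NULL CHECK (amount >= 0))"
)
conn.execute("INSERT INTO orders VALUES (1, 19.99)")


def try_insert(order_id, amount):
    """Return True if the row satisfies the schema, False if it is rejected."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, amount))
        return True
    except sqlite3.IntegrityError:
        return False


# Schema-on-read: raw records are stored untouched; structure is applied
# only when a reader parses them. The list is an assumed stand-in for files.
raw_records = [
    '{"order_id": 1, "amount": 19.99}',
    '{"order_id": 2}',   # missing amount, stored anyway
    'not even json',     # malformed line, stored anyway
]


def read_orders(lines):
    """Parse raw lines at read time, skipping those that do not fit."""
    parsed = []
    for line in lines:
        try:
            record = json.loads(line)
            parsed.append((int(record["order_id"]), float(record["amount"])))
        except (ValueError, KeyError, TypeError):
            continue
    return parsed
```

In this sketch, `try_insert(3, None)` is rejected at write time by the database, while the malformed lines in `raw_records` are accepted into storage and only dropped when `read_orders(raw_records)` interprets them. That is the trade-off the table summarizes: stronger integrity up front versus more flexibility at load time.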