Data Engineering Session
- Data engineering
- building a data warehouse or data lake
- building pipelines
- building a consumable layer for DS
- Data Ops
- monitoring and support, life cycle mgmt
- CI/CD pipelines, infra automation
- UI/UX
- consumed by apps already there,
- or by a new app
to cater to business stakeholders, technical stakeholders, or both
can be any combination of the above
E.g.:
mainframe → Oracle data warehouse (on-prem) → Cloudera Hadoop (on-prem) → AWS cloud
BI - Power BI
Data Engineering → Data Science → UI/UX
Data Ops → on-prem
Data consulting
- tech comparisons (Redshift vs Snowflake)
- system audit → rearchitect?
- understand where they fall short, and the problem statement
- As-is process vs our solution
- Tool, system, flow, data, integration, scalability, compliance, non-functional requirements, model inefficiency
- can be modernizing the platform
- roadmap → target state (advisory, SME)
- pitch that we have worked on similar problems with other customers (presales)
- consulting → implementation
key insight: splitting transactional and analytics systems
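
A minimal sketch of that split, assuming two separate stores: a transactional one that takes row-level writes and an analytics one that only holds pre-aggregated data loaded by a batch job. The table names (`orders`, `daily_revenue`), the sample rows, and the helper `load_daily_revenue` are hypothetical, and in-memory SQLite stands in for both systems purely for illustration.

```python
import sqlite3

# Transactional (OLTP) side: many small row-level writes.
oltp = sqlite3.connect(":memory:")
oltp.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, amount REAL, day TEXT)"
)
oltp.executemany(
    "INSERT INTO orders (customer, amount, day) VALUES (?, ?, ?)",
    [
        ("acme", 120.0, "2024-01-01"),
        ("acme", 80.0, "2024-01-01"),
        ("globex", 50.0, "2024-01-02"),
    ],
)
oltp.commit()

# Analytics side: a separate store holding aggregated, read-mostly data.
olap = sqlite3.connect(":memory:")
olap.execute(
    "CREATE TABLE daily_revenue ("
    "day TEXT PRIMARY KEY, revenue REAL, order_count INTEGER)"
)


def load_daily_revenue(source, target):
    """Batch job: aggregate on the source, then upsert into the target."""
    rows = source.execute(
        "SELECT day, SUM(amount), COUNT(*) FROM orders GROUP BY day"
    ).fetchall()
    target.executemany(
        "INSERT OR REPLACE INTO daily_revenue VALUES (?, ?, ?)", rows
    )
    target.commit()
    return len(rows)


load_daily_revenue(oltp, olap)

# Reports query the analytics store only, never the transactional one.
report = olap.execute(
    "SELECT day, revenue, order_count FROM daily_revenue ORDER BY day"
).fetchall()
print(report)
```

Reporting queries then run against the analytics store, so heavy scans do not compete with transactional writes.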
history of data warehouse technologies
from Teradata through Hadoop/HBase → cloud (minimal OpEx)
SQL interface before; after big data, a programmatic approach
from higher hardware, license, and dev costs to lower ones
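
To illustrate the SQL versus programmatic contrast, here is a rough sketch that computes the same per-source total both ways. The `events` data and both function names are hypothetical. The programmatic version uses a map/group/reduce style, which is an assumption about what "programmatic" means here, since the notes do not name a framework.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (source, bytes) pairs.
events = [("web", 3), ("mobile", 5), ("web", 7), ("batch", 1)]


def totals_sql(rows):
    """Declarative style: state the result, let the engine plan it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (source TEXT, bytes INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    result = dict(
        conn.execute("SELECT source, SUM(bytes) FROM events GROUP BY source")
    )
    conn.close()
    return result


def totals_programmatic(rows):
    """Programmatic style: spell out the grouping and reduction steps."""
    grouped = defaultdict(list)
    for source, size in rows:  # map: emit (key, value) pairs
        grouped[source].append(size)  # group values by key
    return {source: sum(sizes) for source, sizes in grouped.items()}  # reduce


print(totals_sql(events))
print(totals_programmatic(events))
```

Both calls return the same totals; the difference is who decides how the work is done, the query engine or the developer.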