1 point by brian_spiering 3134 days ago | link | parent

Focus on the fundamentals:

1) Redundancy - Everything needs to be backed up, preferably offsite. Any cloud service provider is good for this.

2) Version control - All those scripts need to be in a DVCS (distributed version control system). GitHub is good for this.

3) Architecture diagram and a plan - Document where you are and where you want to be. That gives you something concrete to discuss. Define the makers (producers) and users (subscribers) of the data. This doesn't have to be formal or perfect. I have often found that people hold different implicit assumptions that are at odds with each other, and that the users of the data aren't getting data they could use. Externalizing these assumptions helps resolve that tension.

4) Get a budget, even if it is just orders of magnitude of how much can be spent in time, people, and money.

5) Don't over-engineer or throw the newest technology at the problem. Start with the simplest (unsexy) systems. You probably don't need Spark and friends. I would guess that an RDBMS is going to be very helpful very shortly.
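To make point 3 a bit more concrete: the producer/subscriber map can start as something as small as two dictionaries. This is only a rough sketch. The team names, dataset names, and the `find_mismatches` helper are all made up for illustration, not taken from any real setup.

```python
# Hypothetical inventory: every team and dataset name here is invented.
# Who makes which datasets (producers).
produces = {
    "ingest_scripts": {"raw_events", "daily_sales"},
    "analytics": {"weekly_report"},
}

# Who expects to use which datasets (subscribers).
consumes = {
    "analytics": {"daily_sales", "customer_profiles"},
    "management": {"weekly_report"},
}


def find_mismatches(produces, consumes):
    """Compare what is produced against what subscribers expect."""
    produced = set().union(*produces.values())
    wanted = set().union(*consumes.values())
    return {
        # Data that is produced but that nobody says they use.
        "unused": sorted(produced - wanted),
        # Data that somebody expects but nobody produces.
        "missing": sorted(wanted - produced),
    }


mismatches = find_mismatches(produces, consumes)
print(mismatches)
```

Anything that lands in "missing" or "unused" is one of those implicit assumptions worth talking through with the people involved.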

If this is truly beyond you and the team, hire help. A little bit of consulting will go a long way.

You have a fun and interesting challenge!



