1 point by etrabesi 2601 days ago

It seems like a nice start. I think the things most lacking everywhere are errors, tools, and Redshift's limitations.

I would add the following tools:
- Redshift monitoring: https://github.com/awslabs/amazon-redshift-monitoring
- Redshift admin queries and views: https://github.com/awslabs/amazon-redshift-utils/tree/master... and https://github.com/awslabs/amazon-redshift-utils/tree/master...
- spark-redshift: https://github.com/databricks/spark-redshift
- Snowplow: https://github.com/snowplow/snowplow

I would also add something about the limitations and how to overcome them:
- COPY of JSON vs CSV (non-strict vs strict schema)
- limitations of UDFs (no input possible)
- listagg on more than varchar(max) using ...
- some Redshift SQL tricks, like an equivalent to generate_series: https://github.com/eyaltrabelsi/my-notebooks/blob/master/tot...
- emphasize the lack of recursive functions, triggers, and more, and how one would work around it using code or other tools
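
To make the generate_series point concrete, here is a minimal sketch of one common workaround: cross-joining a small table of digits to build a run of integers. This is not necessarily the trick from the notebook linked above. The function name `series_sql` is hypothetical, and the generated SQL is only exercised against SQLite here, on the assumption that Redshift accepts the same plain `WITH` / `UNION ALL` / `CROSS JOIN` syntax.

```python
import sqlite3


def series_sql(start: int, stop: int) -> str:
    """Build SQL that yields the integers start..stop (inclusive).

    Hypothetical helper: it avoids generate_series by cross-joining a
    ten-row digits CTE once per decimal digit of the span.
    """
    if not (isinstance(start, int) and isinstance(stop, int)) or stop < start:
        raise ValueError("start and stop must be ints with start <= stop")
    span = stop - start
    width = len(str(span))  # decimal digits needed to cover 0..span
    digits = " UNION ALL ".join(f"SELECT {d} AS d" for d in range(10))
    aliases = [f"d{i}" for i in range(width)]
    # k = d0 + 10*d1 + 100*d2 + ... enumerates 0 .. 10**width - 1
    expr = " + ".join(f"{10 ** i} * {a}.d" for i, a in enumerate(aliases))
    joins = " CROSS JOIN ".join(f"digits {a}" for a in aliases)
    return (
        f"WITH digits AS ({digits}) "
        f"SELECT {start} + s.k AS n "
        f"FROM (SELECT {expr} AS k FROM {joins}) s "
        f"WHERE s.k <= {span} "
        f"ORDER BY n"
    )


def run_series(start: int, stop: int) -> list:
    """Run the generated SQL on an in-memory SQLite database (stand-in engine)."""
    conn = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in conn.execute(series_sql(start, stop))]
    finally:
        conn.close()
```

Calling `run_series(3, 14)` runs the query locally; against a real cluster you would send the string from `series_sql` through whatever client you already use.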

Error handling:
- SSL
- load errors
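
For the load errors point, a rough sketch of pulling the most recent failed COPY rows out of Redshift's `stl_load_errors` system table. The helper name `recent_load_errors` is made up for illustration, and it assumes you hand it a DB-API cursor from whichever driver you already use to connect.

```python
LOAD_ERRORS_SQL = """
SELECT starttime, filename, line_number, colname, err_code, err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT {limit}
"""


def recent_load_errors(cursor, limit=10):
    """Return the newest rows from stl_load_errors, most recent first.

    Hypothetical helper: `cursor` is assumed to be any DB-API cursor
    connected to the cluster. The limit is validated and inlined because
    placeholder styles differ between drivers.
    """
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    cursor.execute(LOAD_ERRORS_SQL.format(limit=limit))
    return cursor.fetchall()
```

The `err_reason`, `filename`, and `line_number` columns are usually enough to locate the offending record in the source file.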

And some more advanced stuff, like:
- dynamic schemas: since the data and queries are alive, how would one generate schemas
- WLM and the new features
- auto scaling the cluster (when/how): few slices with big storage vs many slices (network I/O vs parallelism)



1 point by gps13 2580 days ago

Thanks a lot for the feedback. We will try to amend accordingly.
