Scripts to populate the date dimension in a PostgreSQL data warehouse
Anybody who’s worked on a data warehouse knows the date dimension is key. Designing it is complex, and filling it can be tricky and time-consuming. In this article I share the SQL scripts I came up with to fill the date dimension in a PostgreSQL data warehouse.
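To give a feel for what filling a date dimension involves, here is a minimal Python sketch. It is not the SQL from the article: the `date_dimension_rows` helper and the column names (`date_key`, `iso_week`, `is_weekend` and so on) are assumptions made up for illustration, and a real date dimension usually carries many more attributes.

```python
from datetime import date, timedelta


def date_dimension_rows(start, end):
    """Yield one row (as a dict) per calendar day from start to end, inclusive.

    Column names are hypothetical; adapt them to your own schema.
    """
    current = start
    while current <= end:
        # isocalendar() gives (ISO year, ISO week number, ISO weekday 1..7)
        _iso_year, iso_week, iso_weekday = current.isocalendar()
        yield {
            "date_key": int(current.strftime("%Y%m%d")),  # e.g. 20170621
            "full_date": current,
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day": current.day,
            "iso_week": iso_week,
            "iso_weekday": iso_weekday,
            "is_weekend": iso_weekday >= 6,  # Saturday = 6, Sunday = 7
        }
        current += timedelta(days=1)


rows = list(date_dimension_rows(date(2017, 1, 1), date(2017, 12, 31)))
```

Rows built this way could then be bulk-loaded into the warehouse, although generating them directly in SQL, as the article does, avoids the round trip.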
Category: Blog posts
3 must-know Talend best practices
In this post I quickly go through 3 best practices you should know before you start using Talend. They might seem a bit abstract at first, but they will save you a lot of time in the (not so) long run.
Bass Transcription of I Need A Dollar – Aloe Blacc
I Need A Dollar is the song that really made Aloe Blacc known to the general public. In my opinion, the most recognisable instruments on the track are the bass and the brass section. Here is the bass transcription…
Bass Transcription of Magdalene – Lenny Kravitz
Magdalene is my all-time favourite Kravitz song. It was released in 1995 on the album Circus. When I started playing bass, I could not find the transcription anywhere. A few years later, it became one of the first songs I transcribed.
Dealing with non-primitive types with Talend tPostgresqlOutput component
Batch entry 0 INSERT INTO "public"."log" ("source","logged") VALUES ('123.45.56.78','2017-06-21 13:52:23') was aborted. Call getNextException to see the cause.
Ever bump into this kind of exception? This issue drove me crazy, and it took me a good few hours to figure out how to deal with it. The solution is actually very simple… once you know it!
Why JSON fields shouldn’t be used in relational databases
What do PostgreSQL 9.2, Oracle 12c and MySQL 5.7.8 have in common? They all integrated JSON as a possible data type. Pretty cool, huh?! Actually, I think using JSON in a relational database is one of the worst ideas you can have. Let me explain why.
Using Talend tRowGenerator to populate tables with foreign key constraints
Many tutorials show how to use the tRowGenerator component, but most of them (if not all) rely on ad hoc examples, so they are often hard to apply in real-world situations. In this post I’m going to explain how to populate a table that references another table.
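The core constraint can be sketched outside Talend. The Python below is only an illustration of the idea, not the tRowGenerator setup described in the post: every generated child row must pick its foreign key from values that already exist in the parent table. The `generate_child_rows` helper, the column names and the sample `customer_ids` are all hypothetical.

```python
import random


def generate_child_rows(parent_ids, count, seed=None):
    """Generate `count` random child rows whose foreign key is always valid.

    parent_ids: primary keys that already exist in the referenced table.
    Column names (id, parent_id, amount) are made up for this sketch.
    """
    rng = random.Random(seed)  # seeded generator for reproducible test data
    return [
        {
            "id": i,
            # Draw the foreign key from existing parent keys only,
            # so the constraint can never be violated on insert.
            "parent_id": rng.choice(parent_ids),
            "amount": round(rng.uniform(1, 100), 2),
        }
        for i in range(1, count + 1)
    ]


customer_ids = [101, 102, 103]  # pretend these were read from the parent table
orders = generate_child_rows(customer_ids, 10, seed=42)
```

Whatever tool generates the rows, the same ordering applies: the parent table is loaded first, and its keys feed the generator for the child table.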
Paper accepted at IRI 2016!
The paper Improving the Utility of Anonymized Datasets through Dynamic Evaluation of Generalization Hierarchies has been accepted for publication at the IEEE 17th International Conference on Information Reuse and Integration.
The book Big Data: Storage, Sharing, and Security is out!
The title says it all: the book Big Data: Storage, Sharing, and Security is out. Even better, you can buy it here!