Channel: Data Management & Data Architecture
Browsing all 22 articles

Log-based Change Data Capture - lessons learnt

My article on Medium summarizes experiences from various projects with log-based change data capture (CDC). There are many use cases for which CDC is beneficial. Some DBs even have CDC functionality...



Anonymization techniques and data privacy

Anonymization techniques are essential for data analytics or in test/dev databases. Anonymization and pseudonymization are very different but often confused. GDPR does not apply to anonymized data...


PostgreSQL partitioning guide

PostgreSQL partitioning is a powerful feature when dealing with huge tables. Partitioning allows breaking a table into smaller chunks, aka partitions. Logically, there seems to be one table only if...


PostgreSQL columnar extension cstore_fdw

PostgreSQL columnar extension cstore_fdw is a storage extension suited to OLAP-/DWH-style queries and data-intensive applications. Columnar analytical databases have unique characteristics...


PostgreSQL application_name

PostgreSQL application_name can be set in the connection string. The view pg_stat_activity then shows the application_name, which helps identify sessions. The article shows how to set...
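
As a rough illustration of the idea in this teaser, the sketch below builds a libpq-style connection URI that carries application_name and pairs it with a query against pg_stat_activity. The user, host, database, and application names are hypothetical, the `build_dsn` helper is invented for this example, and the `%s` placeholder assumes a driver with psycopg-style parameters; none of this is taken from the article itself.

```python
from urllib.parse import urlencode, urlunsplit


def build_dsn(user, host, dbname, application_name, port=5432):
    """Build a libpq-style connection URI that includes application_name."""
    # application_name is passed as a URI query parameter.
    query = urlencode({"application_name": application_name})
    netloc = f"{user}@{host}:{port}"
    return urlunsplit(("postgresql", netloc, f"/{dbname}", query, ""))


# Hypothetical connection details, for illustration only.
dsn = build_dsn("report_user", "db.example.com", "sales", "nightly_report")
print(dsn)

# Query to find the matching sessions once connected.
# The %s placeholder assumes a psycopg-style driver.
SESSION_QUERY = """
SELECT pid, usename, application_name, state
FROM pg_stat_activity
WHERE application_name = %s;
"""
```

Passing the resulting URI to a PostgreSQL driver and then running the query with the same name as its parameter should list only the sessions opened by that application.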


Data Engineering with dbt – first steps using PostgreSQL and Oracle

dbt is a Data Engineering tool supporting version control with CI/CD for transformations and materialization. The approach with dbt differs from tools like SSIS, DataFactory, Informatica. The...


Materialization examples of Data Engineering with dbt

dbt offers several materialization options to create ETL/ELT processes. The article shows and compares various approaches to using dbt for ETL/ELT. A previous post contains an introduction to dbt:...


Data Vault and Star Schema with PlantUML: Entity Relationship Diagram as Code

Entity Relationship Diagram as code means developers use the same tools for creating the diagrams – or documentation in general – as for coding. Documentation includes more than just source code and...


Predictions about data for 2023 and beyond

Predictions about data for 2023 and beyond. End of the year: it’s the time for predictions. Let’s have a look at some predictions regarding data. There are many predictions for Machine Learning, Deep...



Data visualization with Flourish

Flourish is a data visualization and storytelling platform that helps data enthusiasts understand and communicate complex data. With a wide range of customizable templates and interactive features,...


How to Be Useful: Unpacking Arnold Schwarzenegger’s Secrets to Success

Did you know that the man who conquered bodybuilding, Hollywood, and the political arena believes that his multifaceted success boils down to just seven principles? Yes, Arnold Schwarzenegger, in his...


Vector Database – What, Why, and How

In today’s data-driven world, vector databases are available to handle complex, high-dimensional data. This article describes vector databases including use cases as well as an example with the...


Similarity search in vector databases: a comprehensive guide

Similarity search in vector databases has emerged as a pivotal technique enabling efficient retrieval of information by comparing complex data points within high-dimensional spaces. The ability to...
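
To make the notion of comparing data points concrete, here is a minimal brute-force sketch of similarity search using cosine similarity. The vectors, document names, and the `top_k` helper are hypothetical and chosen only for illustration; real vector databases typically rely on index structures rather than scanning every vector, and this is not the article's own implementation.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Score every stored vector against the query and keep the k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy two-dimensional "embeddings" (hypothetical data).
documents = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}

results = top_k([1.0, 0.1], documents, k=2)
for name, score in results:
    print(name, round(score, 3))
```

The query points almost along the first axis, so doc_a ranks first and doc_c second, while doc_b is nearly orthogonal to the query and drops out of the top two.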


Vector Indexes in Vector Databases: Semantic Search Performance

Vector indexes are crucial for semantic search performance and efficient querying. In this article, I will delve into various types of vector indexes, their workings, pros and cons, and...


Oracle AI Vector – Semantic Search

With the advent of Large Language Models (LLMs), vector databases are becoming increasingly popular. Vector databases and similar approaches have existed for a long time; geodata, for example, have long been...



Vector Vanguard: Tracking the Pulse of Vector Tech 07/2024

Welcome to “Vector Vanguard: Tracking the Pulse of Vector Tech 07/2024” – a source for the latest developments in vector databases, vector indexes, RAG (Retrieval-Augmented Generation), similarity...


Vector Vanguard: Tracking the Pulse of Vector Tech 08/2024

Welcome to “Vector Vanguard: Tracking the Pulse of Vector Tech 08/2024” – a source for the latest developments in vector databases, vector indexes, RAG (Retrieval-Augmented Generation), similarity...



SQL’s unstoppable evolution: DBMS Innovations and how relational DBs...

Michael Stonebraker and Andrew Pavlo wrote about DBMS innovations in their paper “What Goes Around Comes Around… And Around…” which revisits the evolution of data models and database systems over the...


Vector Vanguard: Tracking the Pulse of Vector Tech 09/2024

Welcome to “Vector Vanguard: Tracking the Pulse of Vector Tech 09/2024” – a source for the latest developments in vector databases, vector indexes, RAG (Retrieval-Augmented Generation), similarity...


Humanizing Data Strategy: A concise summary of Tiankai Feng’s 5 Cs Framework

“Humanizing Data Strategy: Leading Data with the Head and the Heart” by Tiankai Feng focuses on a people-centered approach to data strategy. The book introduces the Five Cs Framework, which highlights...
