Datalogz Security
Introduction
This document provides an in-depth look at Datalogz, a SaaS-based, read-only metadata application, detailing how it interfaces with various BI tools, handles authentication, and ensures data security.
Overview of Datalogz SaaS Solution
Datalogz is a specialized tool designed to administer BI environments effectively through the analysis of non-proprietary metadata. It operates exclusively in a SaaS environment, ensuring enhanced efficiency and security.
Metadata Read from BI Tools
| BI Tool | Data Connected | Authentication | Notes |
| --- | --- | --- | --- |
| Tableau | | Personal Access Token | Datalogz supports connecting to the Tableau Cloud or Tableau Server API by Service Account and Personal Access Token with read-only metadata access. |
| Qlik | | | Datalogz integrates with QlikView through its APIs, offering read-only metadata access. This allows Datalogz to analyze and administer the BI environment by accessing QlikView dashboards and reports. The integration supports both QlikView Server and Desktop versions. |
| Power BI | | Service Principal | Datalogz requires read-only access to the Power BI administrative APIs for a single Power BI tenant. |
| Spotfire | | API Key / OAuth Token | Datalogz connects to TIBCO Spotfire using its APIs, enabling read-only metadata access. It supports both cloud and on-premises deployments of Spotfire. The integration allows for analysis and administration of the BI environment by accessing Spotfire dashboards, reports, and data visualizations. Authentication is typically through an API Key or OAuth token, which keeps access restricted. |
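To make the token-based authentication for Tableau more concrete, the sketch below builds (without sending) the sign-in request that the Tableau REST API accepts for a Personal Access Token. The server URL, API version, token name, and site name are placeholder assumptions, and this is not Datalogz's actual connector code.

```python
import json


def build_tableau_signin(server_url, api_version, token_name, token_secret,
                         site_content_url=""):
    """Return (url, headers, body) for a Tableau REST API PAT sign-in.

    All values are supplied by the caller; nothing here is specific
    to Datalogz.
    """
    url = f"{server_url.rstrip('/')}/api/{api_version}/auth/signin"
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    body = {
        "credentials": {
            "personalAccessTokenName": token_name,
            "personalAccessTokenSecret": token_secret,
            # An empty contentUrl addresses the default site.
            "site": {"contentUrl": site_content_url},
        }
    }
    return url, headers, json.dumps(body)


# Hypothetical values, for illustration only.
url, headers, body = build_tableau_signin(
    "https://tableau.example.com/", "3.19", "datalogz-readonly", "<secret>", "analytics"
)
print(url)
```

The response to a successful sign-in carries a session token that is sent on later metadata calls. What that token can read is governed by the permissions of the account that owns the Personal Access Token.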
Connector Pipeline
The Datalogz application uses Apache Airflow for connector management, providing BI Admins with pre-built metadata pipelines that they can schedule on an hourly, daily, or weekly basis. After each connector refresh, new alerts are generated from the metadata that has changed since the previous run.
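A minimal sketch of how the three schedule choices could map onto Airflow's built-in schedule presets (`@hourly`, `@daily`, `@weekly`). The function and dictionary names are hypothetical, and the actual Datalogz DAG definitions are not shown in this document.

```python
from datetime import datetime, timedelta

# Airflow accepts these preset strings as a DAG schedule.
AIRFLOW_PRESETS = {"hourly": "@hourly", "daily": "@daily", "weekly": "@weekly"}

# Equivalent intervals, used here only to illustrate the refresh cadence.
INTERVALS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def connector_schedule(choice):
    """Return the Airflow preset for a BI Admin's schedule choice."""
    try:
        return AIRFLOW_PRESETS[choice]
    except KeyError:
        raise ValueError(f"unsupported schedule: {choice!r}") from None


def next_refresh(last_refresh, choice):
    """Approximate the next refresh time by adding the chosen interval."""
    return last_refresh + INTERVALS[choice]


print(connector_schedule("daily"))
print(next_refresh(datetime(2024, 1, 1), "weekly"))
```

The interval arithmetic is only an approximation of the cadence. Airflow itself aligns preset schedules to fixed boundaries such as the top of the hour or midnight.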
Your connectors will retrieve metadata from the following API endpoints:
BI Admins must configure each connector and approve the Datalogz application. Approval grants read-only access to standard and admin-level APIs for a selected set of Groups. Groups are generally defined as follows for each system:
Power BI: Workspaces
Tableau: Projects
Qlik: Streams
The admin-level APIs provide the most insight, because they widen the range of Issues and Recommendations that Datalogz can surface. After a new connector is created, BI Admins can use Datalogz RBAC to assign fine-grained permissions, so that each User sees only the metadata from the Groups they have been granted.
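The Group-based permission model can be sketched as a simple filter. The class and field names below are hypothetical and do not reflect the actual Datalogz RBAC schema. The sketch only shows the idea that a User sees an asset when they have been granted the asset's Group.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Asset:
    """A piece of BI metadata that belongs to exactly one Group."""
    name: str
    system: str  # e.g. "Power BI", "Tableau", "Qlik"
    group: str   # Workspace, Project, or Stream name


@dataclass
class User:
    name: str
    # (system, group) pairs this user has been granted.
    allowed_groups: set = field(default_factory=set)


def visible_assets(user, assets):
    """Return only the assets whose (system, group) the user was granted."""
    return [a for a in assets if (a.system, a.group) in user.allowed_groups]


assets = [
    Asset("Sales Report", "Power BI", "Finance Workspace"),
    Asset("Churn Dashboard", "Tableau", "Marketing Project"),
    Asset("Ops Overview", "Qlik", "Operations Stream"),
]
analyst = User("analyst", {("Power BI", "Finance Workspace")})
print([a.name for a in visible_assets(analyst, assets)])
```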
Authentication and Security Measures
Secrets used to authenticate to the BI Metadata APIs are encrypted and stored in an Azure Key Vault that is accessed through a managed identity and sits within a private subnet of the Datalogz private virtual network. Only the Datalogz backend virtual machine that runs the data extraction pipelines has the network access and privileges needed to read these secrets.
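As a hedged illustration of this pattern, the sketch below reads a secret from a Key Vault with a managed identity using the Azure SDK for Python (`azure-identity` and `azure-keyvault-secrets`). The vault and secret names are hypothetical, and the document does not state which SDK or language Datalogz actually uses.

```python
def vault_url(vault_name):
    """Build the URL of a Key Vault in the Azure public cloud."""
    return f"https://{vault_name}.vault.azure.net"


def read_connector_secret(vault_name, secret_name):
    """Fetch one secret using the host's managed identity.

    This only succeeds on a host whose managed identity has been granted
    access to the vault and that can reach it over the network.
    """
    # Imported lazily so vault_url() works without the Azure SDK installed.
    from azure.identity import ManagedIdentityCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(
        vault_url=vault_url(vault_name),
        credential=ManagedIdentityCredential(),
    )
    return client.get_secret(secret_name).value


# Hypothetical vault name, for illustration only.
print(vault_url("kv-example"))
```

Because the credential is the VM's own identity, no password or key for the vault needs to be stored in the pipeline code or configuration.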
Datalogz as a "Read Only" Metadata Application
Datalogz provides BI Ops insights and recommendations using metadata about BI reports, users, activities, and other BI assets. Generating these recommendations does not require access to the underlying data, so Datalogz does not ask for that level of permission to be granted to service principal credentials or personal access tokens. As a result, Datalogz cannot query any of the data used in BI reports. It only requires access to metadata that describes those reports, such as the title, description, configuration, asset lineage, usage patterns, refresh durations, uptime, and governance features. This is what is meant by "read-only access to metadata only."
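The distinction between metadata and report data can be illustrated with an allowlist. This is a conceptual sketch only: the key names are assumptions, and Datalogz enforces the boundary through the permissions granted to its credentials, not through a filter like this one.

```python
# Metadata attributes named in the paragraph above; key names are illustrative.
METADATA_FIELDS = {
    "title", "description", "configuration", "lineage",
    "usage", "refresh_duration", "uptime", "governance",
}


def metadata_only(record):
    """Keep only allowlisted metadata keys and drop everything else."""
    return {k: v for k, v in record.items() if k in METADATA_FIELDS}


raw = {
    "title": "Quarterly Revenue",
    "refresh_duration": 42.5,
    "rows": [["2024-Q1", 1000000]],  # report data, outside the metadata scope
}
print(metadata_only(raw))
```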
Comparison with On-Prem Deployment
Compared to the On-Prem Deployment, a SaaS Deployment has the following benefits:
Datalogz monitors and maintains all infrastructure required to run the Datalogz BI Ops platform.
Datalogz monitors and maintains all services, code, and images required to run the backend and frontend services.
Datalogz upgrades, tests, and deploys all new versions of Datalogz to give you a seamless experience between versions.
You do not need to commit resources to the following activities, which are part of an on-prem deployment engagement:
Monitoring and maintaining the infrastructure.
Monitoring and maintaining the services, code, and images required to run the services.
Upgrading and deploying new versions of Datalogz using Docker Desktop and some Windows or Linux commands.
Architecture Diagram
Resource Group: rg-biops-prod-eastus2-001
Resources:
Virtual Machines:
Frontend VM: Hosting services running in containers using Docker Compose behind an nginx web server in the public subnet.
Backend VM: Hosting backend services (ELT API, FastAPI, etc.) in containers using Docker Compose behind an nginx web server in the private subnet.
Virtual Network:
Public Subnet for frontend services.
Private Subnet for backend services.
Databases:
Azure PostgreSQL database for operational data.
Snowflake data warehousing service for BI metadata ingestion and analysis using Azure Storage Integration and Snowflake Network Policy.
Storage:
File Storage: Azure Blob Storage, exposed to Snowflake as an External Stage, used to stage data securely before it is loaded into Snowflake.
Key Vault:
Key Vault for secure storage of keys and passwords, with Managed Identity access exclusively from the backend VM.
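The staging path from Azure Blob Storage into Snowflake can be sketched as two SQL statements, generated here in Python. The object names and the JSON file format are assumptions, identifiers are not escaped, and the actual Datalogz stage definitions are not shown in this document.

```python
def create_stage_sql(stage, storage_account, container, integration, path=""):
    """Return a CREATE STAGE statement for an Azure Blob Storage location."""
    url = f"azure://{storage_account}.blob.core.windows.net/{container}/{path}"
    return (
        f"CREATE STAGE IF NOT EXISTS {stage}\n"
        f"  URL = '{url}'\n"
        f"  STORAGE_INTEGRATION = {integration}\n"
        f"  FILE_FORMAT = (TYPE = JSON);"
    )


def copy_into_sql(table, stage):
    """Return a COPY INTO statement that loads staged files into a table."""
    return f"COPY INTO {table} FROM @{stage};"


# Hypothetical names, for illustration only.
print(create_stage_sql("biops_stage", "stbiops", "staging", "azure_biops_int"))
print(copy_into_sql("raw_metadata", "biops_stage"))
```

The storage integration lets Snowflake authenticate to the storage account without credentials embedded in the stage definition. A Snowflake network policy separately restricts which addresses can connect to the Snowflake account.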