This is an architecture post describing how a Parquet file can be exposed to a SQL Server 2016 database via PolyBase while other applications access the same file directly. Those other applications can be anything, such as data analytics code running in a Hadoop cluster.
This kind of integration is mainly needed when we already have a transactional database such as SQL Server and we need to analyze its data. We can either schedule data movement using ETL technologies, or use PolyBase to move data from an internal table to an external PolyBase table that is backed by a Parquet file. If the solution is in Azure, the Parquet file can live in Azure storage. Once the data is available in Parquet format, the analytics algorithms can work against it directly. Parquet is chosen here because of its familiarity in the analytics community.
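As a rough sketch of the PolyBase side only (the credential, data source, storage account, table, and column names below are illustrative assumptions, not part of this post), the flow on SQL Server 2016 could look roughly like this:

```sql
-- Allow PolyBase to write (export) into external tables.
-- A database master key and an appropriate 'hadoop connectivity'
-- setting are also assumed to be configured on the instance.
EXEC sp_configure 'allow polybase export', 1;
RECONFIGURE;

-- Credential for the Azure storage account that will hold the Parquet files.
-- 'StorageCred', the account name, and the key are placeholders.
CREATE DATABASE SCOPED CREDENTIAL StorageCred
WITH IDENTITY = 'polybaseuser', SECRET = '<storage-account-access-key>';

-- External data source pointing at an Azure Blob Storage container.
CREATE EXTERNAL DATA SOURCE AzureBlobStore
WITH (
    TYPE = HADOOP,
    LOCATION = 'wasbs://analytics@mystorageaccount.blob.core.windows.net',
    CREDENTIAL = StorageCred
);

-- File format telling PolyBase to read and write Parquet.
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);

-- External table backed by Parquet files under /sales in the container.
CREATE EXTERNAL TABLE dbo.SalesExternal
(
    SaleId   INT,
    SaleDate DATETIME2,
    Amount   DECIMAL(18, 2)
)
WITH (
    LOCATION = '/sales/',
    DATA_SOURCE = AzureBlobStore,
    FILE_FORMAT = ParquetFormat
);

-- Copy rows from the internal transactional table to the Parquet-backed
-- external table; Hadoop/Spark jobs can then read the same files
-- directly from storage.
INSERT INTO dbo.SalesExternal (SaleId, SaleDate, Amount)
SELECT SaleId, SaleDate, Amount
FROM dbo.Sales;
```

The point of this sketch is simply that once the INSERT completes, the data sits in ordinary Parquet files in the storage container, so any analytics tool that understands Parquet can consume it without touching SQL Server.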
Such an architecture is shown below.
Fuller implementation details are better shared in a separate post.