Translate

Total Pageviews

Showing posts with label SQL Server. Show all posts
Showing posts with label SQL Server. Show all posts

SQL Server Architecture Explained: Named Pipes, Optimizer, Buffer Manager

 MS SQL Server is a client-server architecture. MS SQL Server process starts with the client application sending a request. The SQL Server accepts, processes and replies to the request with processed data. Let's discuss in detail the entire architecture shown below:

As the below Diagram depicts there are three major components in SQL Server Architecture:

  1. Protocol Layer
  2. Relational Engine
  3. Storage Engine
SQL Server Architecture Diagram

Let's discuss in detail about all the three above major modules. In this tutorial, you will learn.

Protocol Layer - SNI

MS SQL SERVER PROTOCOL LAYER supports 3 Type of Client Server Architecture. We will start with "Three Type of Client Server Architecture" which MS SQL Server supports.

Shared Memory

Let's reconsider an early morning Conversation scenario.

MOM and TOM - Here Tom and his Mom, were at the same logical place, i.e. at their home. Tom was able to ask for Coffee and Mom was able it serve it hot.

MS SQL SERVER - Here MS SQL server provides SHARED MEMORY PROTOCOL. Here CLIENT and MS SQL server run on the same machine. Both can communicate via Shared Memory protocol.

Analogy: Lets map entities in the above two scenarios. We can easily map Tom to Client, Mom to SQL server, Home to Machine, and Verbal Communication to Shared Memory Protocol.

From the desk of configuration and installation:

For Connection to Local DB – In SQL Management Studio, "Server Name" Option could be

"."

"localhost"

"127.0.0.1"

"Machine\Instance"

TCP/IP

Now consider in the evening, Tom is in the party mood. He wants a Coffee ordered from a well-known Coffee Shop. The Coffee shop is located 10 km away from his home.

Here Tom and Starbuck are in different physical location. Tom at home and Starbucks at the busy marketplace. They're communicating via Cellular network. Similarly, MS SQL SERVER provides the capability to interact via TCP/IP protocol, where CLIENT and MS SQL Server are remote to each other and installed on a separate machine.

Analogy: Lets map entities in the above two scenarios. We can easily map Tom to Client, Starbuck to SQL server, the Home/Market place to Remote location and finally Cellular network to TCP/IP protocol.

Notes from the desk of Configuration/installation:

  • In SQL Management Studio - For Connection via TCP\IP, "Server Name" Option has to be "Machine\Instance of the server."
  • SQL server uses port 1433 in TCP/IP.

Named Pipes

Now finally at night, Tom wanted to have a light green tea which her neighbor, Sierra prepare very well.

Here Tom and his Neighbor, Sierra, are in same physical location, being each other's neighbor. They're communicating via Intra network. Similarly, MS SQL SERVER provides the capability to interact via the Named Pipe protocol. Here the CLIENT and MS SQL SERVER are in connection via LAN.

Analogy: Lets map entities in the above two scenarios. We can easily map Tom to Client, Sierra to SQL server, Neighbor to LAN and finally Intra network to Named Pipe Protocol.

Notes from the desk of Configuration/installation:

  • For Connection via Named Pipe. This option is disabled by default and needs to be enabled by the SQL Configuration Manager.

What is TDS?

Now that we know that there are three types of Client-Server Architecture, lets us have a glance at TDS:

  • TDS stands for Tabular Data Stream.
  • All 3 protocols use TDS packets. TDS is encapsulated in Network packets. This enables data transfer from the client machine to the server machine.
  • TDS was first developed by Sybase and is now Owned by Microsoft

Relational Engine

The Relational Engine is also known as the Query Processor. It has the SQL Server components that determine what exactly a query needs to do and how it can be done best. It is responsible for the execution of user queries by requesting data from the storage engine and processing the results that are returned.

As depicted in the Architectural Diagram there are 3 major components of the Relational Engine. Let's study the components in detail:

CMD Parser

Data once received from Protocol Layer is then passed to Relational Engine. "CMD Parser" is the first component of Relational Engine to receive the Query data. The principal job of CMD Parser is to check the query for Syntactic and Semantic error. Finally, it generates a Query Tree. Let's discuss in detail.

Syntactic check:

  • Like every other Programming language, MS SQL also has the predefined set of Keywords. Also, SQL Server has its own grammar which SQL server understands.
  • SELECT, INSERT, UPDATE, and many others belong to MS SQL predefined Keyword lists.
  • CMD Parser does syntactic check. If users' input does not follow these language syntax or grammar rules, it returns an error.

Example: Let's say a Russian went to a Japanese restaurant. He orders fast food in the Russian language. Unfortunately, the waiter only understands Japanese. What would be the most obvious result?

The Answer is – the waiter is unable to process the order further.

There should not be any deviation in Grammar or language which SQL server accepts. If there are, SQL server cannot process it and hence will return an error message.

We will learn about MS SQL query more in upcoming tutorials. Yet, consider below most basic Query Syntax as

SELECT * from <TABLE_NAME>;

Now, to get the perception of what syntactic does, say if the user runs the basic query as below:

SELECR * from <TABLE_NAME> 

Note that instead of 'SELECT' user typed "SELECR."

Result: THE CMD Parser will parse this statement and will throw the error message. As "SELECR" does not follow the predefined keyword name and grammar. Here CMD Parser was expecting "SELECT."

Semantic check:

  • This is performed by Normalizer.
  • In its simplest form, it checks whether Column name, Table name being queried exist in Schema. And if it exists, bind it to Query. This is also known as Binding.
  • Complexity increases when user queries contain VIEW. Normalizer performs the replacement with the internally stored view definition and much more.

Let's understand this with help of below example -

SELECT * from USER_ID

Result: THE CMD Parser will parse this statement for Semantic check. The parser will throw an error message as Normalizer will not find the requested table (USER_ID) as it does not exist.

Create Query Tree:

  • This step generates different execution tree in which query can be run.
  • Note that, all the different trees have the same desired output.

Optimizer

The work of the optimizer is to create an execution plan for the user's query. This is the plan that will determine how the user query will be executed.

Note that not all queries are optimized. Optimization is done for DML (Data Modification Language) commands like SELECT, INSERT, DELETE, and UPDATE. Such queries are first marked then send to the optimizer. DDL commands like CREATE and ALTER are not optimized, but they are instead compiled into an internal form. The query cost is calculated based on factors like CPU usage, Memory usage, and Input/ Output needs.

Optimizer's role is to find the cheapest, not the best, cost-effective execution plan.

Before we Jump into more technical detail of Optimizer consider below real-life example:

Example:

Let's say, you want to open an online Bank account. You already know about one Bank which takes a maximum of 2 Days to open an account. But, you also have a list of 20 other banks, which may or may not take less than 2 days. You can start engaging with these banks to determine which banks take less than 2 days. Now, you may not find a bank which takes less than 2 Days, and there is additional time lost due to the search activity itself. It would have been better to open an account with the first bank itself.

Conclusion: It's is more important to select wisely. To be precise, choose which option is best, not the cheapest.

Similarly, MS SQL Optimizer works on inbuilt exhaustive/heuristic algorithms. The goal is to minimize query run time. All the Optimizer algorithms are propriety of Microsoft and a secret. Although, below are the high-level steps performed by MS SQL Optimizer. Searches of Optimization follows three phases as shown in the below diagram:

Phase 0: Search for Trivial Plan:

  • This is also known as Pre-optimization stage.
  • For some cases, there could be only one practical, workable plan, known as a trivial plan. There is no need for creating an optimized plan. The reason is, searching more would result in finding the same run time execution plan. That too with the extra cost of Searching for optimized Plan which was not required at all.
  • If no Trivial plan found, then 1st Phase starts.

Phase 1: Search for Transaction processing plans

  • This includes the search for Simple and Complex Plan.
  • Simple Plan Search: Past Data of column and Index involved in Query, will be used for Statistical Analysis. This usually consists but not restricted to one Index Per table.
  • Still, if the simple plan is not found, then more complex Plan is searched. It involves Multiple Index per table.

Phase 2: Parallel Processing and Optimization.

  • If none of the above strategies work, Optimizer searches for Parallel Processing possibilities. This depends on the Machine's processing capabilities and configuration.
  • If that is still not possible, then the final optimization phase starts. Now, the final optimization aim is finding all other possible options for executing the query in the best way. Final optimization phase Algorithms are Microsoft Propriety.

Query Executor

Query executer calls Access Method. It provides an execution plan for data fetching logic required for execution. Once data is received from Storage Engine, the result gets published to the Protocol layer. Finally, data is sent to the end user.

Storage Engine

The work of the Storage Engine is to store data in a storage system like Disk or SAN and retrieve the data when needed. Before we deep dive into Storage engine, let's have a look at how data is stored in Database and type of files available.

Data File and Extent:

Data File, physically stores data in the form of data pages, with each data page having a size of 8KB, forming the smallest storage unit in SQL Server. These data pages are logically grouped to form extents. No object is assigned a page in SQL Server.

The maintenance of the object is done via extents. The page has a section called the Page Header with a size of 96 bytes, carrying the metadata information about the page like the Page Type, Page Number, Size of Used Space, Size of Free Space, and Pointer to the next page and previous page, etc.

File types

  1. Primary file
  • Every database contains one Primary file.
  • This store all important data related to tables, views, Triggers, etc.
  • Extension is .mdf usually but can be of any extension.
  1. Secondary file
  • Database may or may not contains multiple Secondary files.
  • This is optional and contain user-specific data.
  • Extension is .ndf usually but can be of any extension.
  1. Log file
  • Also known as Write ahead logs.
  • Extension is .ldf
  • Used for Transaction Management.
  • This is used to recover from any unwanted instances. Perform important task of Rollback to uncommitted transactions.

Storage Engine has 3 components; let's look into them in detail.

Access Method

It acts as an interface between query executor and Buffer Manager/Transaction Logs.

Access Method itself does not do any execution.

The first action is to determine whether the query is:

  1. Select Statement (DDL)
  2. Non- Select Statement (DDL & DML)

Depending upon the result, the Access Method takes the following steps:

  1. If the query is DDL, SELECT statement, the query is pass to the Buffer Manager for further processing.
  2. And if query if DDL, NON-SELECT statement, the query is pass to Transaction Manager. This mostly includes the UPDATE statement.

Buffer Manager

Buffer manager manages core functions for modules below:

  • Plan Cache
  • Data Parsing: Buffer cache & Data storage
  • Dirty Page

We will learn Plan, Buffer and Data cache in this section. We will cover Dirty pages in the Transaction section.

Plan Cache

  • Existing Query plan: The buffer manager checks if the execution plan is there in the stored Plan Cache. If Yes, then query plan cache and its associated data cache is used.
  • First time Cache plan: Where does existing Plan cache come from?

    If the first-time query execution plan is being run and is complex, it makes sense to store it in in the Plane cache. This will ensure faster availability when the next time SQL server gets the same query. So, it's nothing else but the query itself which Plan execution is being stored if it is being run for the first time.

Data Parsing: Buffer cache & Data Storage

Buffer manager provides access to the data required. Below two approaches are possible depending upon whether data exist in the data cache or not:

Buffer Cache - Soft Parsing:

Buffer Manager looks for Data in Buffer in Data cache. If present, then this Data is used by Query Executor. This improves the performance as the number of I/O operation is reduced when fetching data from the cache as compared to fetching data from Data storage.

Data Storage - Hard Parsing:

If data is not present in Buffer Manager than required Data is searched in Data Storage. If also stores data in the data cache for future use.

Dirty Page

It is stored as a processing logic of Transaction Manager. We will learn in detail in Transaction Manager section.

Transaction Manager

Transaction Manager is invoked when access method determines that Query is a Non-Select statement.

Log Manager

  • Log Manager keeps a track of all updates done in the system via logs in Transaction Logs.
  • Logs have Logs Sequence Number with the Transaction ID and Data Modification Record.
  • This is used for keeping track of Transaction Committed and Transaction Rollback.

Lock Manager

  • During Transaction, the associated data in Data Storage is in the Lock state. This process is handled by Lock Manager.
  • This process ensures data consistency and isolation. Also known as ACID properties.

Execution Process

  • Log Manager start logging and Lock Manager locks the associated data.
  • Data's copy is maintained in the Buffer cache.
  • Copy of data supposed to be updated is maintained in Log buffer and all the events updates data in Data buffer.
  • Pages which store the data is also known as Dirty Pages.
  • Checkpoint and Write-Ahead Logging: This process run and mark all the page from Dirty Pages to Disk, but the page remains in the cache. Frequency is approximately 1 run per minute. But the page is first pushed to Data page of the log file from Buffer log. This is known as Write-Ahead Logging.
  • Lazy Writer: The Dirty page can remain in memory. When SQL server observes a huge load and Buffer memory is needed for a new transaction, it frees up Dirty Pages from the cache. It operates on LRU – Least recently used Algorithm for cleaning pages from buffer pool to disk.

Summary:

  • Three Type of Client-Server Architecture exist: 1) Shared Memory 2) TCP/IP 3)Named Pipes
  • TDS, developed by Sybase and now owned by Microsoft, is a packet that is encapsulated in Network packets for data transfer from the client machine to the server machine.
  • Relational Engine contains three major components:

    CMD Parser: This is responsible for Syntactic and Semantic error & finally generate a Query Tree.

    Optimizer: Optimizer role is to find the cheapest, not the best, cost-effective execution plan.

    Query Executor: Query executer calls Access Method and provides execution plan for data fetching logic required for execution.

  • Three types of files exist Primary file, Secondary file, and Log files.
  • Storage Engine: Has the following important components

    Access Method: This Component Determine whether the query is a Select or Non-Select Statement. Invokes Buffer and Transfer Manager accordingly.

    Buffer Manager: Buffer manager manages core functions for Plan Cache, Data Parsing & Dirty Page.

    Transaction Manager: It manager Non-Select Transaction with help of Log and Lock Managers. Also, facilitates important implementation of Write-head logging and Lazy writers.

How to Download and Install SQL Server

 Pre-Requisites

Principally, MS SQL server requires:

.Net Framework,1GB of recommended memory, and NTFS system.

How to download SQL Server Setup

Step 1) Go to URL: https://www.microsoft.com/en-in/sql-server/sql-server-downloads

Microsoft provides two specialized free editions to work on MS SQL server:

  1. Developer – It has all feature which MS SQL server offers but we cannot use it in production. From the learning perspective, is it an ideal candidate to start.
  2. Express: This is also a free version but with the limited set of features with no business intelligence applications.

We will select the Developer edition for installation.

Step 2) Click on "Download now"

We will get set up as 'SQLServer2017-SSEI-Dev.exe'.

How to Install SQL Server

Step 1) Double click on "SQLServer2017-SSEI-Dev.exe". Below screen will appear with three options: Basic, Custom and Download files.

Step 2) Choose the basic version by clicking on the 'Basic' option, as it has all default configuration required to learn MS SQL.

Step 3) 'Microsoft Server License Terms' screen will appear. Read the License Terms and then click 'Accept.'

Step 4) Below 'SQL server install location' window will appear.

  1. The Default location is C:\Program Files\Microsoft SQL Server.
  2. Optionally, we can also change the installation location by clicking on Browse.3. Once the location is selected, click the 'Install' button.

Below 'Downloading install package' progress screen will be displayed. Wait until the download is complete.

Once, the download is complete; the system will initiate installing developer edition.

Below screen show installation progress.

\

Step 5) Once installation is completed successfully, below screen will appear.

This setup is self-sufficient for proceeding further with learning SQL server, and we can 'Close' this window.

However, below is a summary of the label and button:

  1. Instance name: This is by default labeled as MSSQLSERVER.
  2. Connect now: This will open a separate command line window for connection testing of what we have just installed.

    The system will run by default 'select @@Version' statement to confirm that we can connect to new MSSQLSERVER instance successfully.

    1. Customize: This will open the SQL Installation center to customize further and add features other than which are there as a part of the BASIC installation.
    2. Install SSMS: This is IDE that will take us to Microsoft SSMS download link. We will cover SSMS in detail in our SSMS tutorial.
    3. Close: This will close this window. The user is now ready to install SSMS IDE as instructed in SSMS tutorial.

What is SQL Server? Introduction, Version History

 What is SQL Server?

SQL SERVER is a relational database management system (RDBMS) developed by Microsoft. It is primarily designed and developed to compete with MySQL and Oracle database.

SQL Server supports ANSI SQL, which is the standard SQL (Structured Query Language) language. However, SQL Server comes with its own implementation of the SQL language, T-SQL (Transact-SQL).

T-SQL is a Microsoft propriety Language known as Transact-SQL. It provides further capabilities of declaring variable, exception handling, stored procedure, etc.

SQL Server Management Studio (SSMS) is the main interface tool for SQL Server, and it supports both 32-bit and 64-bit environments.

In this tutorial, you will learn.

Version History SQL Server

  • Microsoft and Sybase released version 1.0 in 1989.
  • However, the partnership between these two ended in the early 1990s.
  • Microsoft maintained ownership rights to the name SQL Server.
  • Since the 1990s, subsequent versions of SQL Server have been released including SQL Server 2000, 2005, 2008, 2012, 2014, 2016, 2017, and 2019

SQL Server Editions

Following editions are available

SQL Server Enterprise: It is used in the high end, large scale and mission Critical business. It provides High-end security, Advanced Analytics, Machine Learning, etc.

SQL Server Standard: It is suitable for Mid-Tier Application and Data marts. It includes basic reporting and analytics.

SQL Server WEB: It is designed for a low total-cost-of-ownership option for Web hosters. It provides scalability, affordability, and manageability capabilities for small to large scale Web properties.

SQL Server Developer: It is similar to an enterprise edition for the non-production environment. It is mainly used for build, test, and demo.

SQL Server Express: It is for small scale applications and free to use.

MS SQL Server as Client-Server Architecture

Let's have a look at the below early morning conversation between Mom and her Son, Tom.

Brainstorm:

Ask your brain….!!! "Can you map, who is CLIENT and who the SERVER is?"

The most certain reply would be - "I am pretty smart in that and…. Son is a CLIENT as he is requesting for a cup of coffee and Mother, who is CAPABLE of preparing coffee, is a SERVER."

Here, Tom is requesting his mother, a cup of coffee. Finally, mom does some processing with Milk, coffee, sugar and prepare coffee to serve it hot.

Analogy: MS SQL SERVER architecture.

A CLIENT is an application that sends requests to the MS SQL SERVER installed on a given machine. The SERVER is capable of processing input data as requested. Finally, respond with PROCESSED OUTPUT DATA as a result.

Key Components and Services of SQL Server

Database Engine: This component handle storage, Rapid transaction Processing, and Securing Data.

SQL Server: This service starts, stops, pauses, and continues an instance of Microsoft SQL Server. Executable name is sqlservr.exe.

SQL Server Agent: It performs the role of Task Scheduler. It can be triggered by any event or as per demand. Executable name is sqlagent.exe.

SQL Server Browser: This listens to the incoming request and connects to the desired SQL server instance. Executable name is sqlbrowser.exe.

SQL Server Full-Text Search: This lets user running full-text queries against Character data in SQL Tables. Executable name is fdlauncher.exe.

SQL Server VSS Writer: This allows backup and restoration of data files when the SQL server is not running. Executable name is sqlwriter.exe.

SQL Server Analysis Services (SSAS): Provide Data analysis, Data mining and Machine Learning capabilities. SQL server is integrated with R and Python language for advanced analytics. Executable name is msmdsrv.exe.

SQL Server Reporting Services (SSRS): Provides reporting features and decision-making capabilities. It includes integration with Hadoop. Executable name is ReportingServicesService.exe

SQL Server Integration Services (SSIS): Provided Extract-Transform and Load capabilities of the different type of data from one source to another. It can be view as converting raw information into useful information. Executable name is MsDtsSrvr.exe

SQL Server Instances

SQL Server allows you to run multiple services at a go, with each service having separate logins, ports, databases, etc. These are divided into two:

  • Primary instances
  • Named instances.

There are two ways through which we may access the primary instance. First, we can use the server name. Secondly, we can use its IP address. Named instances are accessed by appending a backslash and instance name.

For example, to connect to an instance named xyx on the local server, you should use 127.0.0.1\xyz. From SQL Server 2005 and above, you are allowed to run up to 50 instances simultaneously on a server.

Note that even though you can have multiple instances on the same server, only one of them must be the default instance while the rest must be named instances. One can run all the instances concurrently, and each instance runs independent of the other instances.

Importance of SQL Server Instances

The following are the advantages of SQL Server instances:

1. For installation of different versions on one machine

You can have different versions of SQL Server on a single machine. Each installation works independently from the other installations.

2. For cost reduction

Instances can help us reduce the costs of operating SQL Server, especially in purchasing the SQL Server license. You can get different services from different instances, hence no need for purchasing one license for all services.

3. For maintenance of development, production and test environments separately

This is the main benefit of having many SQL Server instances on a single machine. You can use different instances for development, production and test purposes.

4. For reducing temporary database problems

When you have all services running on a single SQL Server instance, there are high chances of having problems with the problems, especially problems that keep on recurring. When such services are run on different instances, you can avoid having such problems.

5. For separating security privileges

When different services are running on different SQL Server instances, you can focus on securing the instance running the most sensitive service.

6. For maintaining a standby server

A SQL Server instance can fail, leading to an outage of services. This explains the importance of having a standby server to be brought in if the current server fails. This can easily be achieved using SQL Server instances.

Summary:

  • SQL Server is defined as a relational database management system (RDBMS) developed by Microsoft
  • T-SQL means Transact-SQL, a propriety Language by Microsoft
  • Microsoft and Sybase released version 1.0 in 1989
  • Various Editions of SQL Server are Enterprise, Standard, Web, Developer, and Express
  • Critical components of SQL Server are Database Engine, SQL Server, SQL Server Agent, SQL Server Browser, SQL Server Full-Text Search, etc.
  • You can run multiple instances of SQL Server the same on the same machine.