Friday, October 3, 2014

MYSql : Migrate Data in Big Table

Few days back our client communicated that their archive tables on MYSQL instance are consuming a lot of disk space (> 50GB >10000000 rows)  and half of the data is no longer required.

We have to clean the data and schedule regular removal on datetime basis and we cannot change any configuration on server.

It was risky on production to execute direct DML  :( CTAS was an option too but when we compare with volume of data, it takes a lot time and server load.

MYSql supports bulk data operation with file very effectively.I have used that path and it was successful.

Following I have described that scenario for viewers. You can set your filter criteria and indexes accordingly.

Have a nice weekend :)

-- Create test table
CREATE TABLE `migtest` (
  `Id` INT(11) NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(20) DEFAULT NULL,
  KEY `Id` (`Id`)
);

-- Create procedure to Load test data
DELIMITER #
CREATE PROCEDURE Load_Test()
BEGIN
  DECLARE v1 INT DEFAULT 0;

  WHILE v1 < 1000001 DO
    INSERT INTO migtest (NAME) VALUES (CONCAT('John',v1));
    SET v1=v1+1;
   
    IF MOD(v1,5000) = 0 THEN -- Commit every 5000 records
      COMMIT;
    END IF;
  END WHILE;
 END #

 -- Call stored porc to laod data
 CALL Load_Test();

 -- counting rows
 SELECT COUNT(*) FROM migtest;

 -- Creating New host table for data
 CREATE TABLE `migtest_b` (
  `Id` INT(11) NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(20) DEFAULT NULL,
  KEY `Id` (`Id`));
 
  -- Exporting data to file.
  -- I am using limit to transfer records. You can use your criteria to export records
  SELECT * FROM migtest
-- WHERE  Your criteria
LIMIT 500000
INTO OUTFILE 'F:\\lg_test\\logs.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"' LINES TERMINATED BY '\n';

SELECT '=============== importing File =======================================';   
# Populating new table
LOAD DATA INFILE 'F:\\lg_test\\logs.csv'  INTO TABLE migtest_b
FIELDS TERMINATED BY ','
ENCLOSED BY '"' LINES TERMINATED BY '\n';

 -- counting rows
 SELECT COUNT(*) FROM migtest_b;

-- Table Operations
RENAME TABLE migtest TO migtest_del;
RENAME TABLE migtest_b TO migtest;
DROP TABLE migtest_del;

Monday, September 8, 2014

Oracle : Enable Default Constraint on Table with Millions Records

Recently I got request from development team to add a  new column to an existing table with default value of sysdate.
This table has over 88 million records. if I add default constraint to that table directly, it will acquire exclusive table
lock on table and table is not be acessible for any activity during this operation.
Update this much records can take day(s) to complete, so we can't do this directly on production environment.

Following can be our checklist
1. Add column to table without default constraint
2- backfill new column
3- Enable default on new column

I have developed logic to select a chuck of record and update them.even this chunk can cause transactional locks, os I have
added logic to commit every 5000 records. Now I am in situation to execute it on any environment.




we can use CTAS as an alternate option, I will cover that in my next post.


sqlplus user/pwd;
-----------------------------------------
--Creating test table
-----------------------------------------

CREATE TABLE MyUpdtest
  ( id NUMBER,
    name varchar2(100)
  );
 

-----------------------------------------
--Populating test table
-----------------------------------------
declare
varcount number := 0;
BEGIN
  FOR t IN 1..4000000
  LOOP
  INSERT INTO MyUpdtest
  VALUES (t, 'Employee'||t );
  varcount := varCount+1;
  IF mod(varCount, 5000) = 0 THEN -- Commit every 5000 records
    COMMIT;
  END IF;
END LOOP;
END;
/

-----------------------------------------
--1- Add new column
-----------------------------------------

ALTER TABLE MyUpdtest ADD CreateDate DATE;

-----------------------------------------
--2- backfill new column
-----------------------------------------
/*
Following is the core script to update records.It is suggested to transform it in shell script and schedule to execute in
off-peak hours.
*/
spool on
spool '/loc/upd.txt' -- set spool location accordingly

SELECT TO_CHAR(sysdate,'dd/mm/yyyy HH:MM:SS') AS StartTime FROM dual;

select count(*) as totalRowsToProcess  from MyUpdtest where CreateDate is null;
declare
varcount number := 0;
varcount2 number := 0;
varDte date;
begin
--setting counter to set loop condition
SELECT COUNT(*) INTO varcount2 FROM MyUpdtest WHERE CreateDate IS NULL;
-- setting current date portion only
SELECT to_date(TO_CHAR(sysdate,'dd/mm/yyyy'),'dd/mm/yyyy')
  INTO varDte
  FROM dual;
WHILE varcount2 > 0 --quits if 0 or less
LOOP                -- while loop start
  FOR r IN
        (SELECT ID,
          Name
        FROM MyUpdtest
        WHERE CreateDate IS NULL
        AND rownum        <200001
        ) --limitint batch to 200000
  LOOP
    UPDATE MyUpdtest
    SET CreateDate         = varDte
    WHERE Id             = r.id
    AND name           =r.name;
    varcount              := varCount+1;
    IF mod(varCount, 5000) = 0 THEN -- Commit every 5000 records
      COMMIT;
    END IF;
  END LOOP;-- for loop ends
  COMMIT;
  varcount2:=varCount2-200000;
END LOOP; --ends while loop
END;
/
select to_char(sysdate,'dd/mm/yyyy HH:MM:SS') as EndTime from dual;
spool off

-----------------------------------------
--3- Enable Default
-----------------------------------------
ALTER TABLE MyUpdtest MODIFY CreateDate DEFAULT sysdate;

Monday, July 7, 2014

MSSQL TSql Interview Questions

There may be other ways too to solve same question. I am posting one I know

--=================================================================
--- Creating Temp Table
--=================================================================
CREATE TABLE #tmpEmpSal
(
Ename VARCHAR(20),
Dept  VARCHAR(10),
Salary INT
);
--=================================================================
--Populating temp table
--=================================================================
INSERT INTO #tmpEmpSal
SELECT 'Emp1','Dept1',10000 UNION ALL 
SELECT 'Emp2','Dept1',1000 UNION ALL
SELECT 'Emp3','Dept1',12000 UNION ALL
SELECT 'Emp4','Dept1',13000 UNION ALL
SELECT 'Emp5','Dept2',10000 UNION ALL
SELECT 'Emp6','Dept2',5000 UNION ALL
SELECT 'Emp7','Dept2',12000 UNION ALL
SELECT 'Emp8','Dept3',11000 UNION ALL
SELECT 'Emp9','Dept3',10000 UNION ALL
SELECT 'Emp10','Dept3',10000 UNION ALL
SELECT 'Emp11','Dept3',4000 UNION ALL
SELECT 'Emp14','Dept3',14000 UNION ALL
SELECT 'Emp12','Dept4',70000 UNION ALL
SELECT 'Emp13','Dept4',10000 ;

--=================================================================
-- Select lowest salary in Each department
--=================================================================
WITH myCte AS
(
SELECT
        dept,
        salary,
        ROW_NUMBER() OVER ( PARTITION BY dept ORDER BY salary asc ) AS RowNum
FROM    #tmpEmpSal
)
SELECT dept,
       salary
  FROM myCte
 WHERE RowNum=1;

--====================================
--- select nth highest record
--====================================

--Frist select all distinct salary to see nth number record
SELECT  DISTINCT salary
  FROM #tmpEmpSal
 ORDER BY Salary DESC;

--Query to fetch nth record
SELECT MIN(salary)
  FROM #tmpEmpSal
 WHERE Salary IN (
        SELECT  DISTINCT TOP 6  Salary
          FROM #tmpEmpSal --replace top to required nth position
         ORDER BY Salary desc);

--=================================================================
-- Select total number if employees who are having salary as 10000
--=================================================================
SELECT salary ,COUNT(*) Employees
 FROM  #tmpEmpSal
GROUP BY Salary
HAVING Salary = 10000 ;

--=================================================================
--Select salary and number of poeple having this salary
--=================================================================

SELECT salary ,COUNT(*) NoOfEmployeesHavingThisSalary
 FROM  #tmpEmpSal
GROUP BY Salary

--=================================================================
--Select repeated/duplicated salary.salary that exist mmore than one
--=================================================================
SELECT salary ,COUNT(*) NoOfEmployeesHavinfThisSalary
 FROM  #tmpEmpSal
GROUP BY Salary
HAVING COUNT(Salary) >1 ;
--DROP TABLE #tmpEmpSal;

Friday, May 2, 2014

Remove HTML Tags from MSSQL Column Data

Recently I have assigned a task to format data in column to display  and remove Html tags and related special characters. It was fun to develop that code. all we need to identify start and end index of htlm tag "< >" to replace with empty string.

Here you go ..

--Create temp table
CREATE TABLE #tmp(PrdName VARCHAR(10),HtmlData VARCHAR(2000)
);
 

--Loading test values

INSERT INTO #tmpSELECT 'prd1','<b class="newitem"></b> text<sup>&#174;</sup> - prd1' UNION ALLSELECT 'prd2','<b>prd2</b>' UNION ALLSELECT 'prd3','<b>prd3</b>' ;

--================================

--Create Function to remove html tags

--================================
 
CREATE  FUNCTION fnClearHTMLTags ( @HTMLString NVARCHAR(MAX) )RETURNS NVARCHAR(MAX)AS
BEGIN
DECLARE    
@Start INT ,
            
@End INT ,
            
@Length INT
    
  WHILE CHARINDEX
('<', @HTMLString) > 0 AND CHARINDEX('>', @HTMLString, CHARINDEX('<', @HTMLString)) > 0
  
BEGIN
  SELECT  
@Start  = CHARINDEX('<', @HTMLString) , --selecting start index of html tag
          
@End    = CHARINDEX  ('>', @HTMLString, CHARINDEX('<', @HTMLString)) -- selecting last index of html tag
  
SELECT  @Length = ( @End - @Start ) + 1  --selecting length of html tag data to remove
        
      
IF  @Length > 0
  
BEGIN
  
--Stuff function will replace data between <> to empty string ''
   -- replace function is used to replace other special characters e.g. &,# to empty string
  
SELECT  @HTMLString = REPLACE(REPLACE(REPLACE(REPLACE(STUFF(@HTMLString,@Start, @Length,''), '&', ''),'#', ''), '; -','.'), '*', '')
              
    
END
     END
    
RETURN LTRIM(RTRIM(@HTMLString ))

END  ;
--================================

--Display Data

--================================
  

 SELECT prdName,
      
HtmlData,
      
dbo.fnClearHTMLTags(HtmlData) AS CleanHtmlData
FROM #tmp;




--================================

--Drop Temp Table

--================================
 

DROP TABLE #tmp;-- Drop Function
-- DROP FUNCTION fnClearHTMLTags;

Tuesday, March 11, 2014

Oracle Schema Security Against Authorize Access

Few time back our dev team reported that application started crash and the reason was missing table.

As it was a legacy application and dev knew schema credentials. One of the dev dropped a table and application started throwing error.

This was alarming and I decided to find suitable way to overcome such situation in future.we can simply  create a new user and assign required privileges to new user.But as we can't change application , it was difficult to implement it. we have to avoid solution where we have to provide schema qualification to access objects.

Initially I though about but synonyms but it seems additional overhead to maintain then I found current_schema option from Oracle documentation.

Here is the step by step guide for you to implement and test that solution.

--Connect as sys
1 --Create UserA
CREATE USER USERA IDENTIFIED BY usera;

2 --Grant required privileges to userA
  GRANT CONNECT TO USERA;
  GRANT RESOURCE TO USERA;

3 -Connect as userA
sqlplus usera/usera

SQL*Plus: Release 10.2.0.1.0 - Production on Tue Mar 11 20:36:04 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options

SQL>                             

4 --Create table
  CREATE TABLE USERA_TEST
    ( Id NUMBER, Name VARCHAR2(20)
    );
   
5 --Insert test values USERA.USERA_TEST
INSERT INTO usera_test
SELECT 1,'Val1' FROM dual
UNION ALL
SELECT 2,'Val2' FROM dual
UNION ALL
SELECT 3,'Val3' FROM dual;
COMMIT;
   
6- -Create stored proc
CREATE PROCEDURE usera_sp
AS
BEGIN
  dbms_output.put_line('This is USERA stored Proc');
END;
/

7 -- disconnect USERA

8 --  Connect as sys and process following

9 --Creating role
CREATE ROLE ROLE_USERA_PRIVS;

10 --Create UserB
CREATE USER USERB IDENTIFIED BY userb;

GRANT CONNECT TO USERB;
 
11 --Create Grant Script
Spool ON
spool /path/usera_Privs.sql -- path can be any of your preference

SELECT
  CASE
    WHEN object_type ='TABLE'
    THEN 'Grant select on USERA.'||Object_Name||' to ROLE_USERA_PRIVS;'
    WHEN object_type IN ('PROCEDURE','FUNCTION')
    THEN 'Grant Execute on USERA.'||Object_Name||' to ROLE_USERA_PRIVS;'
    WHEN object_type IN ('SEQUENCE')
    THEN 'Grant SELECT on USERA.'||Object_Name||' to ROLE_USERA_PRIVS;'
    WHEN object_type IN ('MATERIALIZED VIEW','VIEW')
    THEN 'Grant select on USERA.'||Object_Name||' to ROLE_USERA_PRIVS;'
  END AS script
FROM dba_objects
WHERE owner='USERA' ;

spool OFF

12 -- Execute script usera_Privs.sql

13 -- Grant role to UserA and disconnect sys
Grant ROLE_USERA_PRIVS to USERB;

14 -- Connect as USERB
--Now if you will connect to USERB and execute following you will have error "ORA-00942: table or view does not exist"
SQL> select * from usera_test;
select * from usera_test
              *
ERROR at line 1:
ORA-00942: table or view does not exist 

However schema qualification will work
SQL> SELECT * FROM USERA.usera_test;

        ID NAME
---------- --------------------
         1 Val1
         2 Val2
         3 Val3
       
15 -- Disconnect USERB

16 -- Connect as sys.
17 -- Create Trigger to change current_schema to required one. So that our application/user need not to mention schema qualification

SQL>  CREATE OR REPLACE TRIGGER sys.afterlogon_myapp AFTER logon ON DATABASE
  DECLARE
  BEGIN
    -- Trigger Used to Set Correct Schema based on Role
    IF (dbms_session.is_role_enabled('ROLE_USERA_PRIVS')) THEN
      EXECUTE immediate 'alter session set current_schema=USERA';
    END IF;
  END;
  /

18 -- Disconnect sys

19 -- connect USERB 
sqlplus userb/userb

SQL*Plus: Release 10.2.0.1.0 - Production on Tue Mar 11 20:26:09 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
-- execute sql without schema qualification
SQL> select * from usera_test;

        ID NAME
---------- --------------------
         1 Val1
         2 Val2
         3 Val3
-- execute stored proc without schema qualification       
SQL> set serveroutput on
SQL> execute usera_sp
This is USERA stored Proc

PL/SQL procedure successfully completed.
                                          
Enjoy !

Tuesday, December 24, 2013

Handle Pagination in Stored Procedure

On one of my project I had requirement from team to write a stored proc that take page number and returns results accordingly.It was fun to develop that logic.
I am sharing it for readers to help them writing similar code. following code is developed using MSSQL 2005 .
I hope this will help others.

Table Structure.
-----------------
CREATE TABLE [dbo].[tblQuantityPricing](
    [QuantityPricingId] [int] IDENTITY(1,1) CONSTRAINT [PK_tblQuantityPricing] PRIMARY KEY CLUSTERED ,
    [QuantityFrom] [int] NULL,
    [QuantityTo] [int] NULL,
    [Price] [decimal](16, 4) NULL,
    [ItemId] [int] NULL

);

--Sample DATA loading
-----------
INSERT INTO tblQuantityPricing(QuantityFrom,QuantityTo,Price,ItemId)
SELECT 10,20,12.5,123
UNION ALL
SELECT 10,20,12.5,1234
UNION ALL
SELECT 10,20,12.5,1235
UNION ALL
SELECT 10,20,12.5,1231
UNION ALL
SELECT 10,20,12.5,1234
UNION ALL
SELECT 10,20,12.5,1234
UNION ALL
SELECT 10,20,12.5,1235
UNION ALL
SELECT 10,20,12.5,1235
UNION ALL
SELECT 10,20,12.5,1236
UNION ALL
SELECT 10,20,12.5,1236
UNION ALL
SELECT 10,20,12.5,1236
UNION ALL
SELECT 10,20,12.5,1236
UNION ALL
SELECT 10,20,12.5,1238
UNION ALL
SELECT 10,20,12.5,122;


Stored Proc

------------

CREATE PROCEDURE spQuantityPricingByPageGet
    (
      @PageNum TINYINT ,
      @TotalRows INT OUTPUT
    )
AS
    BEGIN
        DECLARE @PageSize INT ;
-- pages size= rows per page. Currently fixed as 10 rows/page. You can set it accordingly or can use input parameter to let application decide
        SET @PageSize = 5 ;
   
-- Page wise result fetch
        WITH    QPrice
                  AS ( SELECT   qp.QuantityPricingId ,
                                qp.itemId ,
                                qp.QuantityFrom ,
                                qp.QuantityTo ,
                                qp.Price,
                                ROW_NUMBER() OVER ( ORDER BY ItemID ASC ) AS RowIndex
                       FROM     dbo.tblQuantityPricing qp
                    
                     )
            SELECT  *
            FROM    Qprice
            WHERE   Qprice.RowIndex BETWEEN ( @PageNum - 1 ) * @PageSize + 1
                                    AND     @PageNum * @PageSize ;
                      
-- Counting Total Rows                                 
        SELECT  @TotalRows = COUNT(*)
        FROM    dbo.tblQuantityPricing;

  

    END ; 

-- Sample Execution
DECLARE @totalRows INT;
EXECUTE spQuantityPricingByPageGet 2,@TotalRows = @TotalRows OUTPUT;
SELECT    @TotalRows as N'@TotalRows'

-- Drop Proc
--DROP PROCEDURE  spQuantityPricingByPageGet
  
--Drop table
--DROP TABLE tblQuantityPricing

Thursday, December 12, 2013

Parametrized OpenQuery

While working on an ETl routine, I need to pass parameter to OpenQuery sql.

I searched and found on most of the place that it can be done only with dynamic sql. I want to avoid dynamic sql.
atlast I found Jeff Moden post on sqlservercentral.com. I am posting it here with the intentions that it may help more peoples and  may come up in search engine results.
Thanks to Jeff Moden.


select <stuff ...>
from OPENQUERY(Banner_Prod, 'SELECT a.Major
,a.FIRSTNAME AS NameFirst
,a.MIDDLENAME AS NameMiddle
,a.LASTNAME AS NameLast
,a.CLASSCODE AS PersonSubType
FROM fredtrailer.student_Directory_info a'
) oq
WHERE oq.STUD_ID = @STUD_ID