• About Dangerous DBA
  • Table of Contents
Dangerous DBA A blog for those DBA's who live on the edge

Category Archives: Varchar

Shredding XML in DB2 – XML Document shredding

August 9, 2012 11:00 am / Leave a Comment / dangerousDBA

XML is most useful, especially when it is well formatted and you know what it contains and all the documents stored in your XML column are the same. When they are not and you can’t exploit fully DB2’s very good XML features, and you don’t know how to query your XML it can be very hard.

The problem

Millions of records of business information with data in them, no one really knows what’s in there, and no one knows what we might be missing out on. There is a solution in place at the moment that uses a Java application to read the record from DB2 parse it as a string, this is because the element names that you could use XQuery against are all in different cases and appear in different locations in the XML therefore writing an XQuery against it is near impossible without missing something, which the Java App does anyway.

The solution – XML Shredder

I can’t and don’t take credit fully for my solution as it is largely based on this article How to list the paths of all elements in an XML document? – Dr.Matthias Nicola. I have adapted it to list both the element name, path and value. First of all we will look at a simple version, then a more advanced probably more useful version with a built in function that I found very useful so that you don’t parse a document bigger than you buffer pool attached to the temp tablespaces where the query is run.

XML Shredder – Basic Code

WITH pathstable (name, node, xpath, value) AS (
  SELECT x.name AS name, 
         x.node AS xmlnode,
         '/' || x.name AS xpath,
         x.value as value
  FROM {SCHEMA}.{TABLE} {ALIAS} ,
       XMLTABLE('${XML COLUMN YOU WISH TO SHRED}/*'
        COLUMNS
          name varchar(300) PATH './local-name()',
          value varchar(12000) PATH 'xs:string(.)',
          node    XML PATH '.') AS x
  WHERE {ALIAS}.{FIELD} etc etc
  UNION ALL
  SELECT y.name AS name, 
         y.node AS xmlnode, 
         xpath|| '/' || y.name AS xpath,
         y.value as value
  FROM pathstable,
       XMLTABLE('$XMLNODE/(*,@*)' PASSING pathstable.node AS "XMLNODE"
        COLUMNS
         name varchar(300) PATH 'local-name()',
         value varchar(12000) PATH 'xs:string(.)',
         node    XML PATH '.') AS y
) SELECT name, xpath, value
  FROM pathstable

So that's the basic version of the SQL and you will probably supply the WHERE clause something to identify the row that you wish to shred. The section "'${XML COLUMN YOU WISH TO SHRED}/*'" you need to keep the '$ and /*' to wrap your XML column. It will get you a result set something like:

name          xpath                        value
-------------------------------------------------------
email         /personal/email              a@b.com
first_name    /personal/name/first_name    Philip

GOTCHA's

  1. The VARCHAR (in this case 12000) size attached to the value column needs to be long enough to all the values strung together, but if you are going to insert it anywhere this long return value can be taken out in the WHERE clause as it is useless any case and therefore the column that you insert into can be far shorter.
  2. Make sure that you can inline you XML otherwise this code will not work. This can be done using ADMIN_EST_INLINE_LENGTH({XML FIELD}) result is -1 as explained below.

XML Shredder - "Advanced" Code

So the original "Basic" code gave you no identifier to the XML, whatever yours may be so by using something like the below you can create an extended "triple-store" a quad store. I do this as the same element name can exist down different paths and can mean slightly different things depending on where it exists, plus the identifier for the row then it can make it unique and useful.


WITH pathstable ({ADDITIONAL FIELD(S)}, name, node, xpath, value) AS (
SELECT {ADDITIONAL FIELD(S)},
     x.name AS name, 
     x.node AS xmlnode,
     '/' || x.name AS xpath,
     x.value as value
FROM {SCHEMA}.{TABLE} {ALIAS} {JOIN TYPE} JOIN {SCHEMA}.{TABLE} {ALIAS} ON {ALIAS ... etc, etc},
       XMLTABLE('${XML COLUMN YOU WISH TO SHRED}/*'
    COLUMNS
      name varchar(300) PATH './local-name()',
      value varchar(12000) PATH 'xs:string(.)',
      node    XML PATH '.') AS x
WHERE ADMIN_EST_INLINE_LENGTH(XML FIELD) <> -1
    AND {ANY OTHER WHERE CLAUSES THAT AFFECT THE RESULT SET}
UNION ALL
SELECT {ADDITIONAL FIELD(S)},
     y.name AS name, 
     y.node AS xmlnode, 
     xpath|| '/' || y.name AS xpath,
     y.value as value
FROM pathstable,
   XMLTABLE('$XMLNODE/(*,@*)' PASSING pathstable.node AS "XMLNODE"
    COLUMNS
     name varchar(300) PATH 'local-name()',
     value varchar(12000) PATH 'xs:string(.)',
     node    XML PATH '.') AS y
) SELECT {ADDITIONAL FIELD(S)}, name, xpath, value
FROM pathstable
WHERE NAME {SOME PREDICATE} {SOME VALUE}
          OR
      VALUE {SOME PREDICATE} {SOME VALUE}
          OR
      {ADDITIONAL FIELD(S)} {SOME PREDICATE} {SOME VALUE}

In addition to the all the new code there to identify the record so you could store it in a table to use later you will notice the line ADMIN_EST_INLINE_LENGTH({XML FIELD}) <> -1. This inline function, full information from IBM DB2 Info Center here is used to make sure that no record that is too long too process. This is needed as your XML is stored separately therefore is not restricted by the page size of table space where your XML holding table exists and if you do not have a temporary table space large enough then this code will not work.

Future posts

Looking at how you could make this into a useful system, with the ability to be queried easily



Posted in: ADMIN_EST_INLINE_LENGTH, DB2 built in functions, DB2 Data Types, DB2 Development, Decompose XML, IBM DB2 LUW, Recursive Query, Recursive SQL, Shred XML, Varchar, XML, XMLTABLE / Tagged: ADMIN_EST_INLINE_LENGTH, DB2 Development, Decompose XML, IBM DB2 LUW, Recursive Query, Recursive SQL, Shred XML, Varchar, XML, XML PATH, XMLTABLE

Dates to “Long” – Friday function

August 3, 2012 1:00 pm / 3 Comments / dangerousDBA

Time for another Friday function.

The ask

This time I was asked by our marketing bods to come up with something that made the dates in the SQL that is merged into our marketing and informational emails (glorified mail merge) from a dd/mm/yyyy (13/06/2012) or yyyy-mm-dd (2012-06-13) to a ddth MMM YYYY so becomes 13th June 2012.

They had been trying to do this themselves, with some mixed and non standard ways of doing it, generally it just made the code really hard to debug when they had issues, personally I could not find anything in the IBM info centre or online to do this so I created them a UDF that takes in a date and supplies out a “long” format date.

DB2 Date to “Long” format – Code


CREATE FUNCTION GLOBAL.CONVERT_DATE_LONG (IN_DATE DATE)
DETERMINISTIC
NO EXTERNAL ACTION
RETURNS VARCHAR(30)
LANGUAGE SQL
BEGIN ATOMIC
    ----------------------------------------------------------------------------
    ----------------------------------------------------------------------------
    --Takes Date in creates a long date
    ----------------------------------------------------------------------------
    ----------------------------------------------------------------------------

    ----------------------------------------------------------------------------
    ----------------------------------------------------------------------------
    --Declare Vars
    DECLARE OUT_DATE VARCHAR(30);
    DECLARE DATE_PART VARCHAR(4);
    DECLARE MONTH_PART VARCHAR(20);
    DECLARE YEAR_PART  VARCHAR(4);

    IF(IN_DATE IS NOT NULL) THEN
        ----------------------------------------------------------------------------
        ----------------------------------------------------------------------------
        --Set some vars
            --Get the Base Parts
            SET DATE_PART = DAY(IN_DATE, 2);
            SET MONTH_PART = MONTHNAME(IN_DATE);
            SET YEAR_PART = YEAR(IN_DATE,4);

            --Remove the 0 from the front of the date part
            IF(DATE_PART IN ('01','02','03','04','05','06','07','08','09')) THEN
                SET DATE_PART = RIGHT(DATE_PART,1);
            END IF;

            --Put the th nd or rd in the DATE_PART
            IF(DATE_PART IN ('3','23')) THEN 
                SET DATE_PART = DATE_PART || 'rd';
            ELSEIF (DATE_PART IN ('2','22')) THEN
                SET DATE_PART = DATE_PART || 'nd';
            ELSEIF (DATE_PART IN ('1','21','31')) THEN
                SET DATE_PART = DATE_PART || 'st';
            ELSE 
                SET DATE_PART = DATE_PART || 'th';
            END IF;

        ----------------------------------------------------------------------------
        ----------------------------------------------------------------------------
        --Set and return the result    
        SET OUT_DATE = DATE_PART || ' ' || MONTH_PART || ' ' || YEAR_PART;

        RETURN OUT_DATE;
   ELSE
        RETURN NULL;
   END IF;

END

Queston?

Is it that there is not the function built into DB2 that does this or does everyone have something similar for the audience that they deal with?

Future Post

I will be looking at converting an integer that represents seconds into a time



Posted in: Convert, Date, DB2 Administration, DB2 Data Types, DB2 Development, IBM DB2 LUW, UDF, Uncategorized, User Defined Function, Varchar / Tagged: convert, Date, DB2, DB2 Administration, DB2 Development, IBM DB2 LUW, UDF, User Defined Functions, Varchar

Time to “English” – Friday Function

July 20, 2012 12:00 pm / 2 Comments / dangerousDBA

I though about starting a new series that I will add to over time the Friday function, I might not be able to create something as useful as Serge Rielau (@srielau) but maybe it will helpful to someone. Over the last few days I have been asked to some interesting things with “my” data like converting times to a more “friendly” format.

The ask

To convert my nicely cleansed times (19:47:45 or 04:23:32) from various non standard sources (the wonders of Big Data) into a time format that is more compatible with your average email reader (7:47pm or 4:23am) as opposed to a techie that might be all right with the nice 24 hour times. I did a lot of searching around and I could not find any inbuilt functions of DB2 LUW that would do this and so I had to create my own user defined function. Please if you know of anything feel free to correct me, but searching the info center then there was nothing.

DB2 Time to “English” – Code

CREATE FUNCTION GLOBAL.CONVERT_TIME_12H(IN_TIME TIME)
DETERMINISTIC
NO EXTERNAL ACTION
RETURNS VARCHAR(10)
LANGUAGE SQL
BEGIN ATOMIC
    ----------------------------------------------------------------------------
    ----------------------------------------------------------------------------
    --Takes 24hrs time in gives back 12hrs time suffixed with AM or PM
    ----------------------------------------------------------------------------
    ----------------------------------------------------------------------------
    DECLARE OUT_TIME VARCHAR(10);

    IF(INT(LEFT(IN_TIME,2)) <= 12) THEN
        SET OUT_TIME = LEFT(IN_TIME,5) || 'am';

        IF(INT(LEFT(IN_TIME,2)) IN (10,11)) THEN
            RETURN REPLACE(OUT_TIME,'.',':');
        ELSE
            RETURN REPLACE(SUBSTR(OUT_TIME,1),'.',':');
        END IF;

    ELSE
        IF(INT(LEFT(IN_TIME,2)) = 24) THEN
            SET OUT_TIME = '00:00am';
        ELSE
            SET OUT_TIME = LEFT((IN_TIME - 12 HOURS),5) || 'pm';
        END IF;

        RETURN REPLACE(OUT_TIME,'.',':');
    END IF;

END

As you can see it takes in standard DB2 time, and I think managed to get all the gotchas like midnight not being 12 hours less. The downside of this function is that the format becomes text and therefore not useable so would not recommend using it for anything but for human format. As I said before if you know of a function to do this in DB2 that is built in I would love to know.

Future articles

In future Friday Functions then I will look at covering non standard asks that I have been required to do. I think after this one I will be needing to create a function to “elongate” a date!



Posted in: Convert, DB2 Data Types, DB2 Development, IBM DB2 LUW, Time, UDF, User Defined Function, Varchar / Tagged: convert, DB2, IBM DB2 LUW, Time, UDF, User Defined Functions, Varchar

DB2 Errors Solved: CLI0109E String data right truncation: SQLCODE=-1822, SQLSTATE=560BD

February 28, 2012 1:56 pm / Leave a Comment / dangerousDBA

So have you ever seen this error message, and it might be an intermittent fault which makes it an even better problem to try and solve:

Unexpected error code “22001” received from data source “<DATABASE>”. Associated text and tokens are “func=”open” msg=” CLI0109E  String data right truncatio”.. SQLCODE=-1822, SQLSTATE=560BD, DRIVER=4.0.100

We had this issue on some production code that did not raise its head till a few months after it had been working fine. The issue seemed to be that a piece of SQL that was generating the error was joining two federated tables on columns of different data types and sizes.  The join in question was a VARCHAR(5) to a VARGRAPHIC(50) over federation, these two data types can be joined together and the size does not matter in general until the VARGRAPHIC column went over 5 characters long,  causing the intermittent  error!

The solution apart from the obvious one we were facing with our data was to CAST the VARGRAPHIC into a VARCHAR(5) the same as the field that it was joining too. This is working now and has fixed the issue and is one to think about for the future.

Another short interesting one for a lunchtime post.

Posted in: Data types, DB2, DB2 Administration, DB2 built in functions, DB2 Maintenance, IBM, Triton, Varchar, Vargraphic / Tagged: data types, intermittent fault, sqlcode, string data, unexpected error, varchar

IDUG – EMEA – 16th – Day Three

November 16, 2011 11:04 pm / Leave a Comment / dangerousDBA

Today was another long day, but was ended by an excellent dinner put on by IBM to thank its customers, with ostrich leg and proper sushi so now we know where all out licensing fee goes!! The talks that I attended did not teach me as much as I had hoped, but I did learn something in each of them though so not a total waste of time.

A DBA’s guide to using TSA – Fredric Engelen

This covered the basics of HADR and then went on to cover how you set up the TSA to take over the HADR, and did not cover the TSM that I hoped it would that I will be implementing soon at Holiday extras. Learn’t:

  1. db2rfpen – Will let force the rollforward of the primary database.

Managing DB2 Performance in an Heterogeneous environment – Jim Wankowski

This covered the differences and similarities between DB2 LUW and DB2 z/OS. Although it was informative I feel the title was not correct for the session and should have been different. I learnt:

  1. When a Sort happens on a VARCHAR column then the column is expanded to its full length – I may ask this question to Scott Hayes when I do his Rocket Science Seminar on Friday

Deep Dive into DB2 LUW offline table and index reorg – Saeid Mohseni

This session was very good, if you are a frequent reader of my blog then you will know that I am trying to get a straight answer to my questions on Reorganisation and Runstats in DB2 and so I got confirmed and learnt:

  1. DB2 reorgs need the current runstats on the table to be correct to give the correct results for the reorganisation identifying stored procedure.
  2. You can parallel run a reorg on a partitioned table index as long as the first, and subsequent runnings do not allow reads.

Data Warehousing – SIG

This was a little disappointing as it did not have an agenda so was unstructured, and I would have liked to have had a little more information on how it was going to be run. It was informative and if any one has heard of “Data Vaulting” then there is a lady from the Netherlands that would really like to know.

Back to the fifties . . . . . 50 fabulous ways for forecasting failures, flaws and finding flubber – Alexander Kopac

This was an excellent talk and there is a lot to try out when I get back home and enough work to keep us going for week probably. The presenter dressed up as a wizard and the bits of SQL he has given in the slides will hopefully make the DB2 team at HX wizards.One main thing to remember is:

  1. KISS – Keep It Simple Stupid

Useful but widely unknown DB2 Functions – Michael Tiefenbacher

Second talk from this guy and if I did not already know, used or have blogged about all the things that he presented this would have been an extreamly useful and I really should have read the Agenda better before getting in there.
And to the final talk of the day:

DB2 LUW Index design, best practice and case studies – Scott Hayes

This was a very good talk and used in conjunction with Alexander’s information I think will build a framework for reviewing indexes and designs at HX. I learnt that:
  1. I need to read up on CLUSTERED indexes
  2. Single column indexes are not good, even though it is the recommended by IBM
  3. You need a good problem statement to come up with a good solution – Can be applied to everything in life.
Tomorrow is the last day of the conference and so it finishes pretty early and so I might get some sight seeing done in the afternoon, but before that I plan on attending:

Thursday, November 17, 2011

08:30 AM – 09:30 AM
Session 15
1899:I/O, I/O, it’s off to Disk I go – I/O Optimization, Elimination, & SSD (Aquarius)
09:45 AM – 10:45 AM
Session 16
2194:Database I/O in the Brave New World (Aquarius)
11:15 AM – 12:15 PM
Session 17
1892:Understanding and Tuning Page Cleaning in DB2 (Aquarius)
12:30 PM – 01:30 PM
Thursday DB2 Panel
So have a good night and see you all in the morning.
Posted in: Data types, DB2, DB2 Administration, DB2 Ecosystem, DB2 Maintenance, EMEA, IBM, IDUG, Reorg Index, Reorg Table, Reorganise Index, Varchar

Post Navigation

← Older Posts
 

Subscribe to Blog via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Join 757 other subscribers

Recent Posts

  • Self generating Simple SQL procedures – MySQL
  • Google Cloud Management – My Idea – My White Whale?
  • Position Tracker – The Stub – Pandas:
  • Position Tracker – The Stub
  • Position Tracker – In the beginning
  • Whats been going on in the world of the Dangerous DBA:
  • QCon London Day 1
  • Testing Amazon Redshift: Distribution keys and styles
  • Back to dangerous blogging
  • DB2 10.1 LUW Certification 611 notes 1 : Physical Design

Dangerous Topics

added functionality ADMIN_EST_INLINE_LENGTH Bootcamp colum convert data types DB2 db2 DB2 Administration DB2 Development db2advis db2licm Decompose XML EXPORT GCP Google IBM IBM DB2 LUW idug information centre infosphere IOT LOAD merry christmas and a happy new year Position Tracking python Recursive Query Recursive SQL Reorganisation Reorganise Reorganise Indexes Reorganise Tables Runstats sql statement Stored Procedures SYSPROC.ADMIN_CMD Time UDF User Defined Functions V9.7 V10.1 Varchar XML XML PATH XMLTABLE

DangerousDBA Links

  • DB2 for WebSphere Commerce
  • My Personal Blog

Disclaimer:

The posts here represent my personal views and not those of my employer. Any technical advice or instructions are based on my own personal knowledge and experience, and should only be followed by an expert after a careful analysis. Please test any actions before performing them in a critical or nonrecoverable environment. Any actions taken based on my experiences should be done with extreme caution. I am not responsible for any adverse results. DB2 is a trademark of IBM. I am not an employee or representative of IBM.

Advertising

© Copyright 2022 - Dangerous DBA
Infinity Theme by DesignCoral / WordPress