PowerShell – Working with xml (via DOM and XPATH)

Let’s say you have the following xml file:

PS C:\> $myxmldata = Get-Content C:\temp\credentials.xml
PS C:\> $myxmldata


  codingbee
  mysecret

PS C:\>

If your xml data is in an xml file, you load it into powershell by casting the output of Get-Content to [xml], like this:

PS C:\> [xml]$myxmldata = Get-Content C:\temp\credentials.xml
PS C:\> $myxmldata

xml                                                         scom
---                                                         ----
version="1.0" encoding="UTF-8"                              scom


PS C:\>

Note: don’t use “Import-Clixml” (which we covered earlier) here, because it only accepts xml files that were generated by powershell in the first place (using “Export-Clixml”).

You can create an “xml object variable” from an xml file like this:

[xml]$myxmldata = Get-Content .\samplexml.xml

 

Once this is done, you can navigate $myxmldata and retrieve values from it just like any ordinary hierarchical object that contains lots of nested objects.
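As a minimal sketch of that navigation: the root element name scom comes from the output above, but the child element names username and password are assumptions, because the listing of the file does not show its tags.

```powershell
# Hypothetical stand-in for C:\temp\credentials.xml.
# The <username>/<password> tag names are assumed for illustration only.
[xml]$creds = @'
<?xml version="1.0" encoding="UTF-8"?>
<scom>
  <username>codingbee</username>
  <password>mysecret</password>
</scom>
'@

# Each dot steps one level down the element hierarchy.
$user = $creds.scom.username
$pass = $creds.scom.password

"User: $user"
```

Elements that hold only text come back as plain strings, so $user can be used directly in other commands.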

Note: xml tag names are case sensitive.
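A short sketch of where the case of a tag name matters in practice. The document below is made up for illustration. Dot notation goes through powershell's own xml adapter, which ignores case, whereas an xpath string must match the tag name exactly:

```powershell
# Hypothetical document with mixed-case tag names.
[xml]$doc = '<Parameters><EnvName>cde9</EnvName></Parameters>'

# Dot notation: powershell's xml adapter matches names without regard to case.
$viaDot = $doc.parameters.envname

# XPath: evaluated by .NET, so the case must match the tags exactly.
$exact     = $doc.SelectSingleNode('/Parameters/EnvName')
$wrongCase = $doc.SelectSingleNode('/parameters/envname')

$viaDot                 # cde9
$exact.InnerText        # cde9
$null -eq $wrongCase    # True, nothing matched
```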

########## detour – start

PS C:\> $myxmldata

xml                 parameters
---                 ----------
version="1.0"       parameters

# Here, the "xml" column actually represents the first xml tag in the xml code, along with its "version" attribute. It is a single tag,
# similar to the likes of <br /> and <hr /> that you would find in html.
# The second "parameters" column has a value of the same name. This is powershell's way of telling you that "parameters" is an outer tag
# that contains more nested tags, hence we can drill down into it, like this:

PS C:\> $myxmldata.parameters

envname      : cde9             # This is actually an attribute of the "parameters" tag.
stage        : 5                # This is actually an attribute of the "parameters" tag.
version      : 1.0              # This is actually an attribute of the "parameters" tag.
modifiedDate : 1382431774679    # This is actually an attribute of the "parameters" tag.
createdDate  : 1379082771334    # This is actually an attribute of the "parameters" tag.
modifiedBy   : OS\SChowdhury    # This is actually an attribute of the "parameters" tag.
active       : true             # This is actually an attribute of the "parameters" tag.
servers      : servers          # PS's way of indicating that an element contains nested elements is to show a property whose value is the same as its name.
schema       : schema           # Here is another element containing nested elements.
environment  : environment      # Here is another element containing nested elements.
source       : source           # Here is another element containing nested elements.

# You can tell inner tags apart from attributes by the fact that an inner tag's property name and value are the same. The values are not in
# curly braces because in this case there is just one instance of each. E.g. there is only one opening and closing tag for the "servers" tag.

PS C:\> $myxmldata.parameters.schema

dbservicename : ARCACDE9.WORLD
sid           : ARCACDE9
modifiedDate  : 1363690045613
modifiedBy    : OS\chowdhury
services      : services
entry         : {entry, entry, entry, entry…}    # here is another nested object, containing multiple entries of the same type.
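If reading the name-equals-value pattern off the output feels fragile, one possible alternative is to ask the node directly for its attributes and child elements. The sketch below uses a cut-down, made-up version of the "parameters" document (the server names are hypothetical):

```powershell
# Hypothetical, shortened version of the "parameters" document shown in the detour.
[xml]$doc = @'
<parameters envname="cde9" stage="5">
  <servers>
    <server>web01</server>
    <server>web02</server>
  </servers>
</parameters>
'@

$root = $doc.parameters

# Attributes and child elements are exposed as separate collections.
$attrNames  = $root.Attributes | ForEach-Object { $_.Name }
$childNames = $root.ChildNodes | ForEach-Object { $_.Name }

"Attributes: $($attrNames -join ', ')"
"Children:   $($childNames -join ', ')"

# Repeated child tags come back as an array, which is what the curly braces indicate.
$doc.parameters.servers.server
```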
########## detour – end

This approach of navigating an xml file is called the “XML DOM” approach:

http://www.w3schools.com/dom/default.asp

Another approach to querying xml data is by using “xpath”. XPath is a syntax for defining parts of an XML document.

Note: the main difference between xpath and xml-dom is that xpath only reads data, whereas xml-dom can also be used to edit xml data:

http://forums.codeguru.com/showthread.php?469432-How-to-run-a-very-long-SQL-statement
http://stackoverflow.com/questions/16671642/dom-vs-xpath-difference

You can find out more about xpath here:

http://www.w3schools.com/xpath/xpath_intro.asp

######## detour – start

xpath crash course:

In XPath, there are seven kinds of nodes:
– element
– attribute
– text
– namespace
– processing-instruction
– comment
– document

######## detour – end

There is a special cmdlet in powershell that can handle xpath notation. It is:

Select-Xml

----------------------------------------------------------------------

Special Chapter – working with xml data using the xpath syntax

http://www.w3schools.com/xpath/xpath_intro.asp

XPath is a syntax for defining parts of an XML document.
XPath uses a path-like (e.g. c:/users/desktops) syntax to drill down into xml data to get to the part you want.
XPath contains a library of standard functions.

In xpath, there are several types of nodes (aka items):

1. Element – This is a pair of tags, with all the content inside it.
2. Attribute
3. Text
4. Namespace
5. Processing-instruction
6. Comment
7. Document

E.g.:

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>                                 <!-- This is an "element" node. Since this particular element houses all the other elements, it is also known as the "root element". -->
  <book>
    <title lang="en">Harry Potter</title>   <!-- Here, the 'lang="en"' is an attribute node. -->
    <author>J K. Rowling</author>           <!-- Here is another element node. -->
    <year>2005</year>
    <price>29.99</price>                    <!-- The "29.99" is called an "atomic value", since it has no children. "en" above is also an atomic value. -->
  </book>
</bookstore>

Now let’s assume we have an xml file called “books.xml”, which contains the following xml code:

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
  <otherbooks>
    <usedbooks>
      <book>
        <title lang="eng">Romeo and Juliet</title>
        <price>5.00</price>
      </book>
    </usedbooks>
    <ebook>
      <book>
        <title lang="eng">Lord of the Rings</title>
        <price>100.00</price>
      </book>
    </ebook>
  </otherbooks>
</bookstore>

First we load this file into an xml-based variable object (called booksxml). Note the [xml] cast, without which the variable would only hold plain text:

PS C:\> [xml]$booksxml = Get-Content .\books.xml
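The Select-Xml cmdlet mentioned earlier can run the same kind of xpath queries. The sketch below inlines the bookstore xml so that it runs without a file on disk. It first reads all the titles, then uses an xpath query to locate one book and changes its price through the DOM (the new price of 34.95 is an arbitrary value chosen for illustration):

```powershell
# Inline copy of the bookstore xml so the example is self-contained.
[xml]$booksXML = @'
<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">Learning XML</title><price>39.95</price></book>
  <otherbooks>
    <usedbooks><book><title lang="eng">Romeo and Juliet</title><price>5.00</price></book></usedbooks>
    <ebook><book><title lang="eng">Lord of the Rings</title><price>100.00</price></book></ebook>
  </otherbooks>
</bookstore>
'@

# Select-Xml returns SelectXmlInfo objects; the matched node sits in the Node property.
$titles = Select-Xml -Xml $booksXML -XPath '//book/title' |
    ForEach-Object { $_.Node.InnerText }

# Find one book with xpath, then edit it via the DOM node that was returned.
$hit = Select-Xml -Xml $booksXML -XPath '/bookstore/book[title="Learning XML"]'
$hit.Node.SelectSingleNode('price').InnerText = '34.95'   # arbitrary new price

$titles
$booksXML.bookstore.book[1].price
```

Select-Xml also has a -Path parameter that takes a file name, so it can query books.xml directly without loading it into a variable first.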

To best learn the xpath syntax, let’s try out the following examples:

eg1:
Here’s how to access the root element:

PS object method:

PS C:\> $booksXML

xml                                    bookstore
---                                    ---------
version="1.0" encoding="ISO-8859-1"    bookstore

XPATH method:

PS C:\> $booksXML.SelectNodes("/")     # note, here we used the "SelectNodes" method to pass the xpath string "/"

xml                                    bookstore
---                                    ---------
version="1.0" encoding="ISO-8859-1"    bookstore

eg2:
Here’s how to access the content of the bookstore element:

PS object method:

PS C:\> $booksxml.bookstore

book                                   otherbooks
----                                   ----------
{book, book}                           otherbooks

XPATH method:

PS C:\> $booksXML.SelectNodes("bookstore")     # note, here we used the "SelectNodes" method to pass the xpath string "bookstore"

book                                   otherbooks
----                                   ----------
{book, book}                           otherbooks

eg3:
Display all book elements throughout the xml file:

PS object method: Not sure how to do this here.

XPATH method:

PS C:\> $booksXML.SelectNodes("//book")     # The "//" means display all matching elements no matter where they are in the document,
                                            # hence in this case we want to display all the "book" elements.

title      price
-----      -----
title      29.99
title      39.95
title      5.00
title      100.00

eg4:
Display all “book” elements that are descendants of the “otherbooks” element:

PS object method: Not sure how to do this here.

XPATH method:

PS C:\> $booksXML.SelectNodes("/bookstore/otherbooks//book")

title      price
-----      -----
title      5.00
title      100.00
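For eg3 and eg4, one possible way to get the same results without an xpath string is the DOM method GetElementsByTagName, which returns every descendant element with the given name. This is a sketch of one option, not necessarily the only or best one, with the bookstore xml inlined so it is self-contained:

```powershell
# Inline copy of the bookstore xml so the example is self-contained.
[xml]$booksXML = @'
<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">Learning XML</title><price>39.95</price></book>
  <otherbooks>
    <usedbooks><book><title lang="eng">Romeo and Juliet</title><price>5.00</price></book></usedbooks>
    <ebook><book><title lang="eng">Lord of the Rings</title><price>100.00</price></book></ebook>
  </otherbooks>
</bookstore>
'@

# eg3 equivalent: every <book> anywhere in the document.
$allBooks = $booksXML.GetElementsByTagName('book')

# eg4 equivalent: the same call, but started from the <otherbooks> element.
$otherBooks = $booksXML.bookstore.otherbooks.GetElementsByTagName('book')

$allBooks.Count                               # 4
$otherBooks | ForEach-Object { $_.price }     # 5.00 and 100.00
```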

eg5:
Get all values for the attribute “lang”:

PS object method: Not sure how to do this here.

XPATH method:

PS C:\> $booksXML.SelectNodes("//@lang")

#text
-----
eng
eng
eng
eng
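A possible object-style counterpart for eg5, again only a sketch: collect the "title" elements with GetElementsByTagName and read the attribute from each one with GetAttribute. It assumes, as in books.xml, that "lang" only ever appears on "title" elements.

```powershell
# Shortened inline copy of the bookstore xml (two books are enough to show the idea).
[xml]$booksXML = @'
<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <otherbooks>
    <ebook><book><title lang="eng">Lord of the Rings</title><price>100.00</price></book></ebook>
  </otherbooks>
</bookstore>
'@

# Read the "lang" attribute from every <title> element in the document.
$langs = $booksXML.GetElementsByTagName('title') |
    ForEach-Object { $_.GetAttribute('lang') }

$langs
```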

eg6:
Of the 2 books that are direct children of the bookstore element, display the first book:

PS object method: Not sure how to do this here. Nearest thing I could do is:

PS C:\> $booksXML.bookstore.book | Where-Object -FilterScript {$_.price -eq "29.99"}     # this is not a good approach, because you have
                                                                                         # to declare "29.99"

XPATH method:

PS C:\> $booksXML.SelectNodes("/bookstore/book[1]")

title      price
-----      -----
title      29.99
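One object-style option for eg6 that avoids hard-coding the price is ordinary array indexing, since repeated child elements are returned as an array. The thing to watch is the numbering: powershell arrays start at 0, whereas xpath positions start at 1. A minimal sketch with a shortened inline document:

```powershell
# Shortened inline copy of the bookstore xml: just the two direct <book> children.
[xml]$booksXML = @'
<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">Learning XML</title><price>39.95</price></book>
</bookstore>
'@

# Object method: index 0 is the first book.
$first = $booksXML.bookstore.book[0]

# XPath method: position 1 is the first book.
$viaXPath = $booksXML.SelectSingleNode('/bookstore/book[1]')

$first.title.InnerText    # Harry Potter
$first.price              # 29.99
$viaXPath.price           # 29.99
```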

You can find more example notations here:

http://www.w3schools.com/xpath/xpath_syntax.asp