I have a Word document converted to an XML file; this is part of that file:

<w:tc>
    <w:tcPr>
        <w:tcW w:w="2130" w:type="dxa"/>
    </w:tcPr>
    <w:p w:rsidR="00255D05" w:rsidRPr="00FF409F" w:rsidRDefault="00255D05" w:rsidP="00D041E7">
        <w:pPr>
            <w:rPr>
                <w:rFonts w:hint="cs"/>
                <w:sz w:val="36"/>
                <w:szCs w:val="36"/>
                <w:rtl/>
                <w:lang w:bidi="ar-JO"/>
            </w:rPr>
        </w:pPr>
        <w:r w:rsidRPr="00FF409F">
            <w:rPr>
                <w:rFonts w:hint="cs"/>
                <w:sz w:val="36"/>
                <w:szCs w:val="36"/>
                <w:rtl/>
                <w:lang w:bidi="ar-JO"/>
            </w:rPr>
            <w:t>myWantedText</w:t>
        </w:r>
    </w:p>
</w:tc>

I am trying to get the value 'myWantedText'. So far I have tried:

$xml = new SimpleXMLElement($fileContents);
foreach($xml->xpath('//w:t') as $t) {
    var_dump($t);
}

but all I am getting is a bunch of object(SimpleXMLElement)[2]

2 Answers

Your input XML is missing the namespace declaration for the w prefix, as Stuart pointed out. Below is your XML with the Word 2003 XML namespace declared on the root element.

<?php

$str = <<<XML
<?xml version="1.0" standalone="yes"?>
<w:tc xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
    <w:tcPr>
        <w:tcW w:w="2130" w:type="dxa"/>
    </w:tcPr>
    <w:p w:rsidR="00255D05" w:rsidRPr="00FF409F" w:rsidRDefault="00255D05" w:rsidP="00D041E7">
        <w:pPr>
            <w:rPr>
                <w:rFonts w:hint="cs"/>
                <w:sz w:val="36"/>
                <w:szCs w:val="36"/>
                <w:rtl/>
                <w:lang w:bidi="ar-JO"/>
            </w:rPr>
        </w:pPr>
        <w:r w:rsidRPr="00FF409F">
            <w:rPr>
                <w:rFonts w:hint="cs"/>
                <w:sz w:val="36"/>
                <w:szCs w:val="36"/>
                <w:rtl/>
                <w:lang w:bidi="ar-JO"/>
            </w:rPr>
            <w:t>myWantedText</w:t>
        </w:r>
    </w:p>
</w:tc>
XML;

$xml = new SimpleXMLElement($str);
$xml->registerXPathNamespace('w', 'http://schemas.microsoft.com/office/word/2003/wordml');
foreach($xml->xpath('//w:t') as $t) {
    var_dump($t);
}
?>

Output:

object(SimpleXMLElement)#2 (1) {
  [0]=>
  string(12) "myWantedText"
}

You can see this working here: http://codepad.org/YRIO6uk3
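
As a follow-up sketch: var_dump() prints the SimpleXMLElement wrapper, so to get the plain text you can cast each match to a string. The variable name $texts and the shortened XML are illustrative assumptions, not part of the original file.

```php
<?php
// Shortened sample; the namespace URI is the one used in the answer above.
// In a real file, use whatever URI the document itself declares for "w".
$str = <<<XML
<w:tc xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
    <w:p>
        <w:r>
            <w:t>myWantedText</w:t>
        </w:r>
    </w:p>
</w:tc>
XML;

$xml = new SimpleXMLElement($str);
$xml->registerXPathNamespace('w', 'http://schemas.microsoft.com/office/word/2003/wordml');

// Cast each matched element to a string to read its text content.
$texts = [];
foreach ($xml->xpath('//w:t') as $t) {
    $texts[] = (string) $t;
}

echo implode("\n", $texts), "\n";
```

Collecting the strings into an array first makes it easy to join or filter them later; echoing $t directly inside the loop works as well, because echo performs the same string conversion.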

2 Comments

I have copied and pasted your code entirely, but all I get is object(SimpleXMLElement)[2]
@alliawwad Change var_dump($t); to echo $t; to print the text. This is a good answer!

The problem is that the character ":" messes up the call to xpath.

As a workaround you can do:

<?php

$str = <<<XML
<?xml version="1.0" standalone="yes"?>
<w:tc>
    <w:tcPr>
        <w:tcW w:w="2130" w:type="dxa"/>
    </w:tcPr>
    <w:p w:rsidR="00255D05" w:rsidRPr="00FF409F" w:rsidRDefault="00255D05" w:rsidP="00D041E7">
        <w:pPr>
            <w:rPr>
                <w:rFonts w:hint="cs"/>
                <w:sz w:val="36"/>
                <w:szCs w:val="36"/>
                <w:rtl/>
                <w:lang w:bidi="ar-JO"/>
            </w:rPr>
        </w:pPr>
        <w:r w:rsidRPr="00FF409F">
            <w:rPr>
                <w:rFonts w:hint="cs"/>
                <w:sz w:val="36"/>
                <w:szCs w:val="36"/>
                <w:rtl/>
                <w:lang w:bidi="ar-JO"/>
            </w:rPr>
            <w:t>myWantedText</w:t>
        </w:r>
    </w:p>
</w:tc>
XML;

$xml = new SimpleXMLElement($str);
$result = $xml->xpath('/*');
echo $result[0]->p->r->t;
?>

OUTPUT:

myWantedText

UPDATE:
Lego's answer is better than this workaround!

4 Comments

I get "Trying to get property of non-object", and if I try var_dump($result[0]->p->r); I get null.
Did you parse the XML using /*?
@alliawwad I updated the answer to contain the full code (including output). Are you doing the exact same thing?
":" is perfectly valid in XML as the namespace delimiter. Apart from the missing namespace declaration, there's nothing wrong with the XML.
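
To illustrate the last comment: once the prefix is declared, the colon is simply the namespace delimiter, and the element can also be reached by property access if each step goes through children() with the namespace URI. This is a minimal sketch, assuming the namespace URI from the first answer; $ns and $text are illustrative names.

```php
<?php
// Assumed namespace URI (taken from the first answer); a real document
// may declare a different URI for the "w" prefix.
$ns = 'http://schemas.microsoft.com/office/word/2003/wordml';

$str = <<<XML
<w:tc xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
    <w:p>
        <w:r>
            <w:t>myWantedText</w:t>
        </w:r>
    </w:p>
</w:tc>
XML;

$xml = new SimpleXMLElement($str);

// Plain $xml->p only sees children without a namespace, so select the
// namespaced children explicitly at every level before descending.
$text = (string) $xml->children($ns)->p->children($ns)->r->children($ns)->t;

echo $text, "\n";
```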
