Dynamically looping XPath to check for child types [on hold]
I'm parsing large HTML documents and pulling out certain values. They all share a common top path; however, I want to add logic within the loop to determine the appropriate parsing logic.
$xpath = new DOMXPath($doc);
$data_array = array();
// Every candidate node shares this parent path. An absolute XPath starts with a single "/".
$table_nodes = $xpath->query("/html/body/div/div[1]/div[2]/table/tbody/tr/td/center/table[1]/tbody/tr/td/table[2]/tbody/tr/th/*");
foreach ($table_nodes as $table_node) {
    // Check if date exists. string() returns '' when nothing matches,
    // so compare against '' (isset() on the trimmed string is always true).
    $xpath_query_date = trim($xpath->evaluate("string(./tbody/tr/td[1]/table/tbody/tr[1]/td/table/tbody/tr[1]/td[1])", $table_node));
    if ($xpath_query_date !== '') {
        $data_array['date'] = $xpath_query_date;
    }
    // Check if pricing exists
    $xpath_query_pricing = trim($xpath->evaluate("string(./tbody/tr[5]/td[3]/span)", $table_node));
    if ($xpath_query_pricing !== '') {
        $data_array['pricing'] = $xpath_query_pricing;
    }
    // ...
}
The example above evaluates a path to a couple of the values I need.
However, I need to check about 15 values, and I'm using the loop because I don't know where the values sit in each document. For example, I don't know under which parent /tbody/tr/td[1]/table/tbody/tr[1]/td/table/tbody/tr[1]/td[1] will appear.
How can I more easily check whether the value I need exists within the loop?
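One way to avoid repeating the evaluate-and-check block for each of the roughly 15 values is to keep the relative paths in a map and loop over it. This is only a minimal sketch: the sample markup, the short paths, and the names $html and $field_paths are hypothetical placeholders, not the real document or the real expressions.

```php
<?php
// Hypothetical, much-shortened stand-in for one candidate table.
$html = '<table><tbody><tr>'
      . '<td>2019-03-01</td><td><span> 9.99 </span></td>'
      . '</tr></tbody></table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Field name => XPath relative to the candidate node (placeholder paths).
$field_paths = array(
    'date'    => './tbody/tr/td[1]',
    'pricing' => './tbody/tr/td[2]/span',
    'missing' => './tbody/tr/td[9]', // matches nothing in this sample
);

$data_array = array();
foreach ($xpath->query('//table') as $table_node) {
    foreach ($field_paths as $field => $path) {
        // string() yields '' when the path matches no node.
        $value = trim($xpath->evaluate("string($path)", $table_node));
        if ($value !== '') {
            $data_array[$field] = $value;
        }
    }
}

print_r($data_array);
```

Adding a sixteenth value then means adding one array entry rather than another if block. Fields whose path matches nothing are simply left out of $data_array.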
Update
There's really nothing more to the document beyond what's posted here. The values I need to parse have XPaths such as
/html/body/div/div[1]/div[2]/table/tbody/tr/td/center/table[1]/tbody/tr/td/table[2]/tbody/tr/th/table[7]/tbody/tr[1]/td[1]/table/tbody/tr[1]/td/table/tbody/tr[1]/td[1]
However, the table index may differ (table[7] vs table[8]) between documents, even though it's the same entity.
/html/body/div/div[1]/div[2]/table/tbody/tr/td/center/table[1]/tbody/tr/td/table[2]/tbody/tr/th/table[8]/tbody/tr[1]/td[1]/table/tbody/tr[1]/td/table/tbody/tr[1]/td[1]
Therefore, I need to check the tail part of the query.
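To illustrate what checking the tail could look like, the sketch below drops the positional predicate from the table step so that the tail alone decides which table matches. The markup, the id "root", and the short $tail path are hypothetical and far simpler than the real documents.

```php
<?php
// Hypothetical markup: the wanted cell sits in the 2nd table here,
// but could sit in a different table in another document.
$html = '<div id="root">'
      . '<table><tbody><tr><td><b>header</b></td></tr></tbody></table>'
      . '<table><tbody><tr><td><span>2019-03-01</span></td></tr></tbody></table>'
      . '</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Tail of the query, relative to whichever table contains it (placeholder path).
$tail = 'tbody/tr[1]/td[1]/span';

// No index on the table step: any table that has the tail will match.
$nodes = $xpath->query("//div[@id='root']/table/$tail");
$date = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;

// Same idea as a predicate, selecting the table itself to recover its index.
$tables = $xpath->query("//div[@id='root']/table[$tail]");
$index = null;
if ($tables->length > 0) {
    $count = $xpath->evaluate('count(preceding-sibling::table)', $tables->item(0));
    $index = (int) $count + 1;
}

var_dump($date, $index);
```

This only works if the tail is distinctive enough that a single table matches. If several tables share the same tail, the first match in document order is used here, which may not be the intended one.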
php xpath
put on hold as off-topic by Mast, Ludisposed, Sᴀᴍ Onᴇᴌᴀ, Toby Speight, Zeta 13 hours ago
This question appears to be off-topic. The users who voted to close gave this specific reason:
- "Lacks concrete context: Code Review requires concrete code from a project, with sufficient context for reviewers to understand how that code is used. Pseudocode, stub code, hypothetical code, obfuscated code, and generic best practices are outside the scope of this site." – Mast, Ludisposed, Sᴀᴍ Onᴇᴌᴀ, Toby Speight, Zeta
If this question can be reworded to fit the rules in the help center, please edit the question.
Essential parts of your code are missing. Is this all part of a function? Part of a larger script? How is data handled in the rest of your program? What happens in the rest of the foreach? What does the document look like? It's hard to tell what the easiest method is if we don't see most of what's going on.
– Mast, 16 hours ago
asked 17 hours ago by Kermit (new contributor), edited 15 hours ago