Use code formatting for verbatim input/output examples
AdminBee


would give me:

<meta name="description" content="terreno de 936 m², Terreno en venta en paseo Blasco Ibáñez s/n, Costa Esuri, Ayamonte, Costa Esuri">

In an attempt to scrape just the text, without any HTML, I trialled applying sed and I ended up with this code that works as expected:

terreno de 936 m², Terreno en venta en paseo Blasco Ibáñez s/n, Costa Esuri, Ayamonte, Costa Esuri

whereas it gives me the untrimmed

<meta name="description" content="terreno de 936 m², Terreno en venta en paseo Blasco Ibáñez s/n, Costa Esuri, Ayamonte, Costa Esuri">

One may need to go to https://www.idealista.com/inmueble/94238881/, then open developer tools in their browser and copy as cURL in order to play with this example.

edited tags
John Smith
added 314 characters in body
John Smith

My question is why doesn't sed -E 's;^.*(content=\")\">$;;' work? It was meant to give me this result:

terreno de 936 m², Terreno en venta en paseo Blasco Ibáñez s/n, Costa Esuri, Ayamonte, Costa Esuri

whereas it gives me the untrimmed <meta name="description" content="terreno de 936 m², Terreno en venta en paseo Blasco Ibáñez s/n, Costa Esuri, Ayamonte, Costa Esuri">. One may need to go to https://www.idealista.com/inmueble/94238881/, then open developer tools in their browser and copy as cURL in order to play with this example.
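
For anyone who wants to reproduce this without the cURL step, here is a minimal sketch that feeds the sample line from above straight into sed. It assumes the whole meta tag sits on a single line that ends in `">`, and the helper name extract_content is purely illustrative. The likely reason the original expression leaves the line untouched is that the pattern requires `content="` to be followed immediately by `">`, with nothing in between, so it never matches; the group also wraps `content="` rather than the text to keep. The variant below captures the text between the quotes and substitutes it for the whole line.

```shell
# Sample line copied from the question; in practice it would come from the curl output.
line='<meta name="description" content="terreno de 936 m², Terreno en venta en paseo Blasco Ibáñez s/n, Costa Esuri, Ayamonte, Costa Esuri">'

# Original attempt: the pattern demands `content="` directly followed by `">`,
# so it does not match this line and sed prints the line unchanged.
printf '%s\n' "$line" | sed -E 's;^.*(content=\")\">$;;'

# Sketch of a fix (hypothetical helper name): capture everything between the
# quotes of content="..." and replace the whole line with that capture.
extract_content() {
  sed -E 's;^.*content="([^"]*)">$;\1;'
}

printf '%s\n' "$line" | extract_content
```

The second command should print only the description text. If the real page puts other attributes after content, or splits the tag over several lines, the pattern would need adjusting.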


John Smith