Entrepreneur Geek

Nirav Mehta on life, technology and future

Archive for August, 2009

Twitter Weekly Updates for 2009-09-02

without comments

  • #Question What's the best tool to discard uninteresting tweets? #junk #spam #twitter #filter #uninteresting
  • Building simple tools that solve annoying problems :)
  • Discussing internship project with Vaibhav, Amit, Ninad & Arpit.
  • Step 1 in being "alive": have a big goal that involves your passion
  • Bought family pack, lifetime upgrades of #postbox yesterday. Migrated. Finding bugs. But looks good! http://getpostbox.com/

Written by Nirav

August 30th, 2009 at 10:40 pm

Posted in Updates

Twitter Weekly Updates for 2009-08-23

without comments

Written by Nirav

August 23rd, 2009 at 10:40 pm

Posted in Updates

Fixing bad XML, any recommendations?

without comments

I am using PEAR's Text_Diff classes in PHP to generate the differences between two XML documents. The output is not always well-formed XML: the tag nesting is sometimes wrong. This happens because my source files are XML and carry their own tags. When Text_Diff wraps the changed text in its own <ins> and <del> tags, a changed run can start inside one element and end inside another, so the inserted tags overlap the existing ones and break the hierarchy.
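To make the failure concrete, here is a minimal sketch in Python (chosen only for brevity; in PHP, DOMDocument::loadXML could serve as the same kind of check). The documents and the `naive_diff` string are hypothetical and hand-written to show the kind of overlap described above; they are not actual Text_Diff output.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical old and new versions of a small document.
old_doc = "<doc><p>Hello world</p><p>Bye</p></doc>"
new_doc = "<doc><p>Hello there</p><p>See you</p><p>Bye</p></doc>"

# Hand-written illustration of a text-level diff result: the inserted run
# starts inside one <p> and ends inside the next, so <ins> and <p> overlap.
naive_diff = (
    "<doc><p>Hello <del>world</del><ins>there</p>"
    "<p>See you</ins></p><p>Bye</p></doc>"
)

print(is_well_formed(old_doc))     # both inputs parse
print(is_well_formed(new_doc))
print(is_well_formed(naive_diff))  # the overlapping <ins> does not
```

Both inputs parse, but the marked-up result does not, because `</p>` arrives while `<ins>` is still open.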

I am looking for a clean, fast and safe way to fix such invalid XML. Do you have any recommendations?

I have looked at Tidy, its PHP library, and htmLawed. I liked htmLawed since it is a pure-PHP implementation, but I don’t know how fast it is compared to Tidy. Moreover, I need an XML cleaner, not necessarily an XHTML cleaner. So even if I use these libraries, I will have to strip out the HTML parts from the output.

Do you have any suggestions / recommendations?
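One possible direction, offered as a sketch and not as a tested recommendation: avoid the repair step by diffing tokens, so that a tag is never split and only text is ever wrapped. The Python below is an assumption-laden illustration (the names `tokenize`, `mark` and `diff_xml` are hypothetical, and the regex tokenizer ignores comments, CDATA sections and ">" inside attribute values). It keeps every tag of the new document in order, wraps changed text in `<ins>` or `<del>`, and drops tags that exist only in the old document.

```python
import re
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# Split markup into tag tokens and text tokens. Deliberately simplistic:
# comments, CDATA sections and ">" inside attribute values are not handled.
TOKEN_RE = re.compile(r"<[^>]+>|[^<]+")


def tokenize(xml_text):
    """Return the document as a flat list of tag and text tokens."""
    return TOKEN_RE.findall(xml_text)


def mark(tokens, wrapper, keep_structure):
    """Wrap non-blank text in <wrapper>; keep tags/whitespace only if asked."""
    out = []
    for tok in tokens:
        if tok.startswith("<") or not tok.strip():
            if keep_structure:
                out.append(tok)
        else:
            out.append(f"<{wrapper}>{tok}</{wrapper}>")
    return out


def diff_xml(old_xml, new_xml):
    """Mark text changes while emitting the new document's tags unchanged."""
    old, new = tokenize(old_xml), tokenize(new_xml)
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(new[j1:j2])
        if op in ("delete", "replace"):
            # Old-only tags are dropped; old-only text becomes <del>.
            out.extend(mark(old[i1:i2], "del", keep_structure=False))
        if op in ("insert", "replace"):
            # New tags pass through; new text becomes <ins>.
            out.extend(mark(new[j1:j2], "ins", keep_structure=True))
    return "".join(out)


old_doc = "<doc><p>Hello world</p><p>Bye</p></doc>"
new_doc = "<doc><p>Hello there</p><p>See you</p><p>Bye</p></doc>"

result = diff_xml(old_doc, new_doc)
ET.fromstring(result)  # raises ParseError if the output were not well-formed
print(result)
```

The trade-off is granularity: a changed text node is marked as a whole instead of word by word, and structural changes such as a removed element show up only through their text. A change to the root tag itself could also place a `<del>` outside the root, so this sketch assumes both documents share the same root element.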

Written by Nirav

August 19th, 2009 at 5:12 pm

Posted in PHP

Twitter Weekly Updates for 2009-08-16

without comments

Written by Nirav

August 16th, 2009 at 10:40 pm

Posted in Updates