htmlcutstring Python package

20 07 2009

I previously released PHP and JavaScript implementations for cutting an HTML string; they truncate the text while keeping the HTML tags intact. I have now released the same thing in Python under the name htmlcutstring. You can find it at http://pypi.python.org/pypi/htmlcutstring/1.0 .

It is easy to extract an excerpt of a text string with a given length limit. But if you want to extract an excerpt from HTML, the tags that may exist in the text string make it more complicated.

This module provides a solution to extract excerpts from HTML documents with a given text length limit without counting the length of any HTML tags.

This module cuts a string that contains HTML tags. It does not count the tags themselves; it counts only the text inside them and keeps the tags as they are.

Example: if the string is “welcome to <b>Python World</b> <br> Python is bla” and we want to cut it to 16 characters, the output will be “welcome to <b>Python</b>”.

While cutting, it keeps the tags that belong to the retained text, drops the rest, and does not disturb the div structure.

USAGE1:
obj = HtmlCutString("welcome to <b>Python World</b> <br> Python is",16)
newCutString = obj.cut()

USAGE2:
newCutString = cutHtmlString("welcome to <b>Python World</b> <br> Python is",16)
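
The behaviour described above (count only the text, copy tags through, close whatever is still open at the cut) can be sketched with the standard library's html.parser. This is only an illustration, not the code inside htmlcutstring: the names ExcerptBuilder and cut_html_sketch are made up for this example, and the sketch does not special-case script or style content.

```python
from html import escape
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ExcerptBuilder(HTMLParser):
    """Hypothetical helper: copies markup until `limit` text characters are emitted."""

    def __init__(self, limit):
        super().__init__(convert_charrefs=True)
        self.remaining = limit      # text characters still allowed
        self.out = []               # pieces of the excerpt
        self.open_tags = []         # tags opened but not yet closed
        self.done = limit <= 0      # True once the text budget is spent

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())  # raw tag, attributes untouched
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # Only close the innermost open tag; stray closing tags are dropped.
        if not self.done and self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.done:
            return
        piece = data[:self.remaining]               # tags are never counted
        self.out.append(escape(piece, quote=False))
        self.remaining -= len(piece)
        if self.remaining == 0:
            self.done = True

    def result(self):
        # Close everything still open, innermost first.
        closers = "".join(f"</{t}>" for t in reversed(self.open_tags))
        return "".join(self.out) + closers


def cut_html_sketch(markup, limit):
    builder = ExcerptBuilder(limit)
    builder.feed(markup)
    builder.close()   # flush any buffered trailing text
    return builder.result()
```

With a limit of 17, cut_html_sketch("welcome to <b>Python World</b> <br> Python is", 17) returns "welcome to <b>Python</b>". The sketch counts every text character, spaces included, so its limits may not line up exactly with the package's own counting.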




Cut an HTML string while keeping the HTML tags as they are

12 04 2009

With a WYSIWYG editor, users can enter HTML content. When displaying a teaser (a short version) of that content on a home page or dashboard, it is hard to cut the string properly. I could not find any existing code for this, and I needed it in both JavaScript and PHP, so I implemented the same logic in both. I hope it is helpful to others. The code for JavaScript and PHP is available at http://code.google.com/p/cut-html-string/

Using PHP code

You can download the PHP code from this link.
Download the code and extract it. Include the cutstring.php file and use it like this:

$data = "<span style='color:green;'>aa<em>BB</em><ul><li>one</li><li>two</li><li>three</li></ul></span>";
$wanted_count = 10;
$cutstrObj = new HtmlCutString($data,$wanted_count);
$new_string = $cutstrObj->cut();
echo $new_string;

Update: We can directly use the cut_html_string function:

$data = "<span style='color:green;'>aa<em>BB</em><ul><li>one</li><li>two</li><li>three</li></ul></span>";
$wanted_count = 10;
$new_string = cut_html_string($data,$wanted_count);
echo $new_string;

Update 2: The above code will output the following string:

<span style="color:green;">aa<em>BB</em><ul><li>one</li><li>two</li></ul></span>

Using JavaScript code

You can download the JavaScript code from this link.
Download the code and extract it. Include the cutstring.js file in your HTML file and use it like this:

var data = '<span style="font-weight:bold">aa<em>BB</em></span><span id="1">Test <b>Cutstring</b></span> bb';
var wanted_count = 10;
var cutstrObj = new CutString(data,wanted_count);
var newStr = cutstrObj.cut();
console.log(newStr);

Update: Now we can directly use the cutHtmlString function:

var data = '<span style="font-weight:bold">aa<em>BB</em></span><span id="1">Test <b>Cutstring</b></span> bb';
var wanted_count = 10;
var newStr = cutHtmlString(data, wanted_count);
console.log(newStr);

Update 2: The above code will log the following output:

<span style="font-weight: bold;">aa<em>BB</em></span><span id="1">Test <strong>C</strong></span>

At first I implemented it easily in JavaScript using the DOM, but implementing the same logic in PHP took a lot of time. In the end I implemented it in PHP using DOMDocument.
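
For readers who want to see what a DOM-based cut can look like, here is a rough Python sketch of the same idea: parse the markup into a tree, walk it in document order, shorten the text node where the limit is reached, and remove every node after it. It is not a port of cutstring.js or cutstring.php. It uses xml.dom.minidom, which only accepts well-formed markup (for example <br/> rather than <br>), and the names _trim and cut_html_dom_sketch are invented for this example.

```python
from xml.dom import minidom


def _trim(node, remaining):
    """Trim `node` in place; return how many text characters are still allowed."""
    for child in list(node.childNodes):
        if remaining <= 0:
            # Budget spent: drop everything that comes after the cut.
            node.removeChild(child)
        elif child.nodeType == child.TEXT_NODE:
            # Only text counts against the limit.
            child.data = child.data[:remaining]
            remaining -= len(child.data)
        else:
            # Elements are kept and searched recursively.
            remaining = _trim(child, remaining)
    return remaining


def cut_html_dom_sketch(fragment, limit):
    # Wrap the fragment so the parser always sees a single root element.
    doc = minidom.parseString(f"<root>{fragment}</root>")
    root = doc.documentElement
    _trim(root, limit)
    # Serialise the children only, leaving the artificial root out.
    return "".join(child.toxml() for child in root.childNodes)
```

Because the tree keeps track of nesting, there is no need to remember which tags are open: serialising the trimmed tree closes them automatically. Called with the PHP sample string above and a limit of 10, this sketch returns <span style="color:green;">aa<em>BB</em><ul><li>one</li><li>two</li></ul></span>.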