i have some questions - Web Developer


Mar. 12th, 2005 @ 04:58 pm
i'm building my own blogging system for my website, and i want some security around code posted in the comments. the goals:


  • disallow html code

  • disallow javascript

  • special tags (standard BBcode - [quote][url][img])

  • keep it xhtml1.1 compliant



so, i have achieved the first 3 points, but i was wondering which php functions you would recommend for filtering the comments text?

plus, is there any way to guarantee that the document stays xhtml compliant?


(cross-posted to webdev and php)
From:nodrew
Date:March 12th, 2005 05:22 pm (UTC)
As long as you disallow people from posting HTML code in the comments, you should have absolutely no problems keeping it XHTML compliant... assuming, of course, that the code you wrote is compliant.

What do you mean by "filtering the comments text"? If you mean keeping out naughty words and such, probably just work with regular expressions. They're a bit tricky to learn, but very powerful. I'm guessing, however, that if you've achieved the first three points (specifically #3), you're probably already using regular expressions anyway.

Give more details on what type of filtering you're thinking of doing and I'll see what I can come up with. :)
From:andr3
Date:March 12th, 2005 05:33 pm (UTC)
well, that was exactly why i posted this.

i'm using really basic functions to disable html, and converting those special bbcode tags into the markup i want. [url] to anchor tags, etc.

that's what i mean by filtering. not the naughty words ;) i'm fine with that.

i'm not familiar with regexp, although i do want to get into them at some point. cause i know their power.

let me tell you how i've achieved those points, in a very basic way.

used htmlentities() to disable any html tags. (which disabled js as well)
used nl2br() to create line breaks
used str_replace() to replace [quote] with <div class="quote">, etc.

but i'm not happy with this way.. i find it too basic, and probably, too faulty. i don't have much experience with php security, so i was wondering what you guys would use in this case.
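
For reference, the three steps listed above could be sketched roughly like this. The function name format_comment_basic is made up for illustration, the UTF-8 charset is an assumption, and the type declarations assume a current PHP version; it is a sketch of the described approach, not the poster's actual code.

```php
<?php
// Hypothetical helper: the three steps from the comment above, in one place.
function format_comment_basic(string $text): string
{
    // Step 1: escape everything, so <b>, <script>, etc. show up as plain text.
    // (UTF-8 is an assumption about the site's encoding.)
    $out = htmlentities($text, ENT_QUOTES, 'UTF-8');

    // Step 2: turn newlines into <br /> (nl2br emits the XHTML form by default).
    $out = nl2br($out);

    // Step 3: blind string swap for the BBcode markers.
    // Nothing here checks that every [quote] has a matching [/quote].
    return str_replace(
        ['[quote]', '[/quote]'],
        ['<div class="quote">', '</div>'],
        $out
    );
}
```

Because step 3 swaps each marker independently, a comment containing only an opening [quote] still produces an opening div with no closing tag.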
From:ceejayoz
Date:March 12th, 2005 05:37 pm (UTC)
"used str_replace() to replace [quote] with <div class="quote">, etc."

Doing that'll let people break your pages by forgetting the closing [/quote] tag, which leaves the div unclosed.

That's why you need regular expressions with preg_replace.
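
A minimal sketch of that idea, assuming the comment text has already been escaped before this runs. The function name bbcode_quote is hypothetical. The pattern only converts a [quote] that has a matching [/quote], so an unclosed tag is left as literal text instead of opening a div.

```php
<?php
// Hypothetical helper: convert only complete [quote]...[/quote] pairs.
// Assumes $escaped has already had its HTML escaped.
function bbcode_quote(string $escaped): string
{
    $result = preg_replace(
        // Non-greedy match; the "s" flag lets "." span newlines.
        '/\[quote\](.*?)\[\/quote\]/s',
        '<div class="quote">$1</div>',
        $escaped
    );

    // preg_replace() returns null on failure; fall back to the untouched input.
    return $result ?? $escaped;
}
```

Nested quotes are not handled specially here: the pattern pairs each opening tag with the nearest closing tag and leaves any leftover markers as plain text, so the output stays balanced.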
From:andr3
Date:March 12th, 2005 05:34 pm (UTC)
and thanks in advance for any help

(forgot to include this in the other post)
From:kw34hd1
Date:March 12th, 2005 05:36 pm (UTC)
bbcode is dumb when there are so many html parsers out there. just write a function that strips out all disallowed html tags.

-j
From:ceejayoz
Date:March 12th, 2005 05:38 pm (UTC)
Or use a better system like Textile.
From:vitiate_elysium
Date:March 12th, 2005 05:54 pm (UTC)
Agreed. Whenever I'm looking to allow only *some* HTML I just use strip_tags().
From:kw34hd1
Date:March 13th, 2005 03:03 am (UTC)
strip_tags() is extremely unsafe.

-j
From:vitiate_elysium
Date:March 13th, 2005 03:27 am (UTC)
ok?
From:bjou
Date:March 13th, 2005 04:05 am (UTC)
strip_tags() has been binary safe for a while. what are you talking about? security; as in...hacking safe? it should be ok there too.

what kind of an alternative would you suggest, other than the ones previously mentioned in this post?
From:bunnyhero
Date:March 13th, 2005 05:53 am (UTC)
i believe strip_tags is unsafe because it doesn't strip the attributes of allowed tags. here's a warning from the documentation:
This function does not modify any attributes on the tags that you allow using allowable_tags, including the style and onmouseover attributes that a mischievous user may abuse when posting text that will be shown to other users.
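
A small demonstration of that warning. The input string is made up for illustration; the only function used is strip_tags() with an allow-list of b.

```php
<?php
// Illustrative input: an allowed tag carrying an event handler,
// plus a tag that is not on the allow-list.
$input = '<b onmouseover="alert(1)">hover me</b><script>alert(2)</script>';

// Allow only <b>. The script tags are removed (their text content stays),
// but the attributes on the allowed <b> pass through untouched.
$output = strip_tags($input, '<b>');

echo $output, "\n";
```

The onmouseover handler is still present in the result, which is the behaviour the documentation warns about.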

From:vitiate_elysium
Date:March 13th, 2005 01:18 pm (UTC)
Yeah, and that's an understandable argument. But I would hardly call that "extremely unsafe." Pretty much any function, depending on how it's used, could be considered "extremely unsafe." print() is extremely unsafe if you publicly print the root password for your box (:
From:andr3
Date:March 13th, 2005 03:43 pm (UTC)
i see... so maybe i could run a home-made function to grab all the tags that survived strip_tags() and either erase those onmouseover and style attributes or simply replace the words with something else, rendering them useless.
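
One way such a home-made function might look, as a rough sketch only. It takes a simpler route than the one described above: instead of hunting for specific attributes, it drops every attribute from the tags that survive strip_tags(). The function name and the default allow-list are assumptions for illustration, and a regex pass like this is less robust than a maintained HTML sanitizer.

```php
<?php
// Hypothetical helper; the default allow-list is an illustrative assumption
// and only covers paired inline tags (no self-closing tags such as <br />).
function strip_tags_and_attributes(
    string $html,
    string $allowed = '<b><i><em><strong>'
): string {
    // Pass 1: drop every tag that is not on the allow-list.
    $kept = strip_tags($html, $allowed);

    // Pass 2: rewrite each surviving tag to its bare name, discarding
    // all attributes (style, onmouseover, and anything else).
    $bare = preg_replace('/<(\/?)([a-z][a-z0-9]*)\b[^>]*>/i', '<$1$2>', $kept);

    // preg_replace() returns null on failure; fail closed with an empty string.
    return $bare ?? '';
}
```

Removing all attributes avoids having to keep a list of dangerous ones up to date, at the cost of not supporting tags that need attributes, such as links.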

i am going to look into this later today, when i get home.

thank you all for your feedback.