Web authoring advice

Saturday, October 1st, 2005 10:39 pm
[personal profile] lethargic_man
I've been sort-of commissioned to make a website for Marom. The only complex websites I've made until now have been strictly functional ones for work, driven by Perl CGI backends and with no security measures, since they sat behind work's firewall. I can stumble my way through simple JavaScript from knowledge of C++, but don't know any Flash.

So I have a few questions for anyone who's made a public-facing website beforehand:

Firstly, any recommendations for a hosting ISP? This is a site of interest probably only to a few hundred people, but those few hundred might be visiting it frequently.

Secondly, as regards making it updatable by non-techies, I was thinking of providing proformas and inclusion of "what's on this week" files, etc, by server-side includes. Are server-side includes something ISPs are likely to provide?

Thirdly, would I stand a chance of having Perl available as a backend? And if so, how do you make it secure? I gather taint-checking is involved; once I've taken that into account, can I leave it all to Perl? Actually, I probably don't need Perl for the basic proposition, but when I was talking about it with Assael, he waxed lyrical about the possibilities -- message boards, a dating subsite, the works -- and it would be nice to be able to extend the site after its initial launch.

Any other advice whilst I'm at it?

Date: 2005-10-02 12:36 am (UTC)
From: [identity profile] ewx.livejournal.com

Anyone offering "CGI" is almost certain to offer Perl.

Perl's taint mode is neither necessary nor sufficient; i.e. you can do without it (and many do), and even if you do use it you don't have any guarantee of security (you do have a guarantee of inconvenience, though).

You should think about how data you are passing across any interface will be interpreted. For instance anything you pass to the single-arg form of system will be treated as a shell command, with all the interesting bits of syntax that implies. You don't want your users to be able to insert a raw ; or ``, or any of a number of other characters, into a shell command you execute.

There are several approaches to dealing with this kind of thing. The two main ones are sanitization and quoting. The former means limiting the set of characters passed to those known to be safe: that is to say you remove or reject all characters not known to be safe. The latter means passing any character but quoting it appropriately for the interface in question so that the shell (or whatever) will not treat it as starting a new command or something. In the case of shell commands Perl's \Q operator is handy here.

I prefer the quoting approach, but both have their place.
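
The two approaches above, plus the option of not involving a shell at all, can be sketched roughly as follows. This is a Python analogue rather than Perl, and the names `sanitize`, `quote_for_shell` and `run_without_shell` are made up for illustration; the allowlist pattern is only an assumption about what counts as "known to be safe" for a simple file name. In Perl the list form of system plays the same role as the argument list here.

```python
import re
import shlex
import subprocess
import sys

# Assumed allowlist: letters, digits, underscore, dot and hyphen only.
SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]+")

def sanitize(value: str) -> str:
    """Sanitization: reject anything containing a character not known to be safe."""
    if not SAFE_NAME.fullmatch(value):
        raise ValueError(f"unsafe characters in {value!r}")
    return value

def quote_for_shell(value: str) -> str:
    """Quoting: accept any character, but make the shell see one literal word."""
    return shlex.quote(value)

def run_without_shell(value: str) -> str:
    """Avoid the shell: an argument list is passed to the program untouched."""
    # The child process here is just the Python interpreter echoing its argument.
    child = [sys.executable, "-c", "import sys; print(sys.argv[1])", value]
    return subprocess.run(child, capture_output=True, text=True, check=True).stdout
```

With these definitions, `quote_for_shell("a; ls")` produces a single quoted word, so the `;` no longer ends the command, while `sanitize("a; ls")` simply refuses the input.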

(Hopefully you've noticed that I've phrased things in terms of rejecting/removing/quoting characters not known to be safe, rather than rejecting those known to be unsafe.)

You should understand the interfaces you use. In Perl system is an obvious danger point as is ``; but open can take shell commands too, so if the caller has any influence at all over choice of filenames then you must be incredibly careful - if the client controls the first or last character of the “filename” for instance then they can potentially arrange for the command of their choice to be executed. Another trap with user-controlled filenames is using .. to escape into a parent directory - perhaps via a child directory! - and access some file that ought not to be available.
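
As a rough illustration of the .. trap, here is a Python sketch (not Perl) of one way to check that a user-supplied name stays inside a base directory. The function name `resolve_inside` and the directory used in the usage note are hypothetical. It does not cover the Perl-specific problem of open treating a leading or trailing | as a command; using the three-argument form of open with an explicit mode avoids that interpretation.

```python
from pathlib import Path

def resolve_inside(base_dir: str, user_name: str) -> Path:
    """Return the resolved path for user_name, refusing anything outside base_dir."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." components, including ones reached via a child directory.
    candidate = (base / user_name).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"{user_name!r} escapes {base}") from None
    return candidate
```

For example, `resolve_inside("/srv/site/files", "sub/../../etc/passwd")` raises ValueError, as does an absolute name such as `/etc/passwd`, while a plain `notes.txt` is accepted.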

Changes of encoding used to be a classic source of error - e.g. decoding URL-encoding twice. Re-emitting input data to output without suitable SGML (or whatever) quoting is another very common error (a problem with this being that it lets an attacker “run” arbitrary HTML, perhaps with embedded scripts, with the victim's privileges, provided they can trick the victim into visiting a carefully chosen URL). Format string vulnerabilities (where you let the user control the first arg to printf or similar) are more commonly found in C programs, and probably aren't as evil in Perl as in C, but could in principle at least cause problems in Perl programs.
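
To make the double-decoding problem concrete, here is a small Python sketch. The helper names `decode_once` and `is_safe_component` are invented for the example, and the check itself is deliberately simplistic. The point is that a value which passes a check after one round of URL-decoding can turn into something else if any later layer decodes it again.

```python
from urllib.parse import unquote

def decode_once(raw: str) -> str:
    """Apply exactly one round of URL-decoding."""
    return unquote(raw)

def is_safe_component(name: str) -> bool:
    """A simplistic check: not empty, not '.' or '..', and no slash."""
    return name not in ("", ".", "..") and "/" not in name

raw = "%252e%252e"             # what the attacker sends
once = decode_once(raw)        # "%2e%2e": looks harmless and passes the check
twice = decode_once(once)      # "..": what a second, accidental decode yields
```

The safe pattern is to decode exactly once, at a known point, and to validate the value in the form in which it will actually be used.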

The idea of taint checking is that every bit of data received from an “untrusted” source is marked as tainted and any attempt to pass it across an interface where a badly chosen value could cause unexpected effects provokes an error. There are rules about which operations keep and lose the taint. I'm not really a fan of this approach myself, and it's noticeable that taint checking seems to be very rarely used - I think in practice it's just not as useful as it sounds.

About the one thing you don't have to worry about in Perl is buffer overruns. But there are a large number of other potential disasters.

Date: 2005-10-02 10:20 am (UTC)
From: [identity profile] lethargic-man.livejournal.com
(Hopefully you've noticed that I've phrased things in terms of rejecting/removing/quoting characters not known to be safe, rather than rejecting those known to be unsafe.)

Well I certainly have now. :o)

About the one thing you don't have to worry about in Perl is buffer overruns. But there are a large number of other potential disasters.

Ulp. Maybe I'll get someone with knowledge of insecurities to look over my scripts once I've written them.

But thanks for your advice anyway!

Date: 2005-10-02 03:24 pm (UTC)
From: [identity profile] pseudomonas.livejournal.com
I find taint checking very useful. It's not foolproof, but it is good at enforcing discipline (rather than providing security in itself) and keeping an eye on where variables have come from. Apart from that, I agree with your comments.

I'd also suggest reading up a little about SQL injection attacks if you're using any databases as a back-end to anything.
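
As a minimal sketch of what SQL injection looks like and how placeholders avoid it, here is a Python example using the standard sqlite3 module. The `members` table, its columns and the function names are all hypothetical. Perl's DBI offers the same `?` placeholder mechanism.

```python
import sqlite3

def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE members (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO members VALUES (?, ?)",
        [("alice", "alice@example.org"), ("bob", "bob@example.org")],
    )
    return conn

def find_member_unsafe(conn: sqlite3.Connection, name: str) -> list:
    """Vulnerable: the input is pasted into the SQL text itself."""
    query = f"SELECT name, email FROM members WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_member(conn: sqlite3.Connection, name: str) -> list:
    """Safer: the input travels as a bound parameter, never as SQL syntax."""
    query = "SELECT name, email FROM members WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

Passing the name `x' OR '1'='1` to `find_member_unsafe` returns every row, because the quote characters end the string literal and the rest is parsed as SQL; the same input to `find_member` matches nothing.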

Date: 2006-05-26 01:07 pm (UTC)
From: [identity profile] lethargic-man.livejournal.com
Okay, I'm now going through my taint-safe forms checking they're otherwise secure, and I've a few more questions.

You should think about how data you are passing across any interface will be interpreted. For instance anything you pass to the single-arg form of system will be treated as a shell command, with all the interesting bits of syntax that implies. You don't want your users to be able to insert a raw ; or ``, or any of a number of other characters, into a shell command you execute.

Okay, my scripts are doing one of two things with their input. I have one script which uses the CGI params passed to it (by other links on the site) to call an off-site URL (what it's doing is rebranding a Yahoo Calendar to fit the site). This merely passes the parameters on to a LWP::UserAgent call to Yahoo, and reports them back to the user in the form of a link to the linked-to site. I would imagine this does not need protecting, as the user could not get control of anything here bar what is passed on to the Yahoo site.

The other thing I'm doing is parsing a webform by substituting the filled in field values and emailing it on (with fixed email headers) as an HTML page. Here I strip out angle brackets and script tags before sending it on. I'm using Mail::Sender (actually Mail::Sender::Easy), which connects direct to the mail server via a socket, to do the mailing. Do you know what the vulnerabilities of this are? Do I need to do any further substitution, like you suggest above; or worry about anything that this might do, in re buffer overruns or anything?

One of my webforms allows uploading of a file. I'm using code I found on the Web (http://perlmeme.org/tutorials/cgi_upload.html) to do so in a taint-safe manner, but is there anything else I should worry about in this regard?

Date: 2006-05-26 01:29 pm (UTC)
From: [identity profile] pseudomonas.livejournal.com
Isn't that against the Yahoo TOS? That sort of thing usually is.

As for the webform, it'll depend on how you're parsing the input. Better than stripping out known bad stuff is only allowing known good stuff.

Perl doesn't tend to have buffer-overrun problems very much; its problems are much more to do with its tendency to eval all sorts of things.

Date: 2006-05-26 01:41 pm (UTC)
From: [identity profile] lethargic-man.livejournal.com
Isn't that against the Yahoo TOS? That sort of thing usually is.

Depends how anal they're being, I suppose. It's no different, really, from presenting a syndicated RSS feed that looks different to its originating page (which I'm also doing). It's not that I'm trying to pass off Yahoo's work as my own: I'm providing a link to the original site, and including the Yahoo copyright notice, complete with hyperlinks to the terms of service and privacy policy.

Where do you draw the line? If you regard that as not okay, what if I'd got Yahoo's site in a frame on another site, with other content around it? Or an iframe?

OTOH, maybe I should get advice on this before I make the site live...

Date: 2006-05-26 01:42 pm (UTC)
From: [identity profile] pseudomonas.livejournal.com
The main question is probably if you're stripping out their advertising.

Date: 2006-05-28 08:37 am (UTC)
From: [identity profile] lethargic-man.livejournal.com
There is no advertising on the main Yahoo calendar page, just links to other parts of Yahoo.

Date: 2006-05-28 09:13 am (UTC)
From: [identity profile] lethargic-man.livejournal.com
Isn't that against the Yahoo TOS? That sort of thing usually is.

This is crazy. I've just been looking at the ToS for Blogspot, and Flickr, which I shall also be remunging in this way. The Blogspot ToS (http://www.blogger.com/terms.g) state:
11. Pyra PROPRIETARY RIGHTS You acknowledge and agree that the Service and any necessary software used in connection with the Service ("Software") contain proprietary and confidential information that is protected by applicable intellectual property and other laws. [...] Except as expressly authorized by Pyra or advertisers, you agree not to modify, rent, lease, loan, sell, distribute or create derivative works based on the Service or the Software, in whole or in part.
Which seems clear enough... and yet they provide an Atom feed for each blog, which seems expressly to encourage such behaviour! Moreover the ToS are not referenced in the Atom feed XML (http://maromuk.blogspot.com/atom.xml) or the "About Atom feeds" page (http://help.blogger.com/bin/answer.py?answer=697), nor do the ToS refer to syndication at all!

With Flickr, it's even crazier. Flickr's been bought by Yahoo. The Flickr ToS (http://www.flickr.com/terms.gne) page links to both Flickr's and Yahoo's ToS. The Flickr page says
The Flickr service makes it possible to post images hosted on Flickr to outside websites. This use is accepted (and even encouraged!). However, pages on other websites which display images hosted on flickr.com must provide a link back to Flickr from each photo to its photo page on Flickr.
However, the Yahoo ToS (http://docs.yahoo.com/info/terms/), the same as started off this subthread, say:
17. YAHOO!'S PROPRIETARY RIGHTS

You acknowledge and agree that the Service and any necessary software used in connection with the Service ("Software") contain proprietary and confidential information that is protected by applicable intellectual property and other laws. [...] Except as expressly authorized by Yahoo! or advertisers, you agree not to modify, [...] distribute or create derivative works based on the Service or the Software, in whole or in part.

Yahoo! grants you a personal, non-transferable and non-exclusive right and license to use the object code of its Software on a single computer; provided that you do not (and do not allow any third party to) copy, modify, create a derivative work from, [...] or otherwise attempt to discover any source code [...] or otherwise transfer any right in the Software. You agree not to modify the Software in any manner or form, nor to use modified versions of the Software[...]. You agree not to access the Service by any means other than through the interface that is provided by Yahoo! for use in accessing the Service.
And yet Flickr provides a syndication feed too! (I suppose the "interface that is provided by Yahoo! for use" could be a get-out clause for the syndication feed...)

Date: 2006-05-27 10:08 pm (UTC)
From: [identity profile] ewx.livejournal.com

The linked script allows a user who has a login on the system the CGI runs on to (over-)write any filename that the CGI can write to.

Stripping out script tags from HTML is doomed to failure unless you do a full SGML parse and reconstruct the message using only known-good elements and conservative quoting, since you cannot possibly hope to take account of the quirks of everyone else's HTML parsers.

Date: 2006-05-27 10:22 pm (UTC)
From: [identity profile] lethargic-man.livejournal.com
Thanks for your help again.

The linked script allows a user who has a login on the system the CGI runs on to (over-)write any filename that the CGI can write to.

Good point, though not relevant as I'm not using the save-back-to-disk part of the script.

Stripping out script tags from HTML is doomed to failure unless you do a full SGML parse and reconstruct the message using only known-good elements and conservative quoting, since you cannot possibly hope to take account of the quirks of everyone else's HTML parsers.

Does changing all angle brackets into HTML entities such as I have in my script count as conservative quoting in this instance?

Date: 2006-05-27 10:43 pm (UTC)
From: [identity profile] ewx.livejournal.com
Yes, but if you're going to do that, I'm not sure why you want HTML input in the first place?

Date: 2006-05-28 08:38 am (UTC)
From: [identity profile] lethargic-man.livejournal.com
I don't; I want plaintext. Only the plaintext is getting interpolated into the HTML in place of the webform for forwarding on. The end-user doesn't know that, but I can't risk them second-guessing it.

Date: 2006-05-28 10:24 am (UTC)
From: [identity profile] ewx.livejournal.com

Ah, right. I'm unclear at what point you're substituting into HTML, but whichever point that is, that is where you should SGML-quote it, and not before.

I would:

  • Convert at least < and & into entities;
  • Strip out control characters (0-8, 11-31, 127-159);
  • Convert anything outside ASCII range into numeric character references, avoiding the need to know what encoding the output uses.

Obviously you need to know what the input encoding is in order to be able to correctly interpret any bytes outside the 0-127 range. DisOrder uses <form ... enctype="multipart/form-data" accept-charset=utf-8> to request that the input be UTF-8.

Once you have encoded every character with special meaning, there is no need to delete script tags, since there's no possibility of anything being interpreted as a tag any more.
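
The three steps listed above could be sketched like this, in Python rather than Perl. The function name `sgml_quote` is made up, and the input is assumed to be already decoded from UTF-8 into a text string. Quoting > and " as well goes slightly beyond the "at least < and &" of the list and is an addition of this sketch.

```python
def sgml_quote(text: str) -> str:
    """Quote decoded text for insertion into HTML element content."""
    named = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;"}
    out = []
    for ch in text:
        code = ord(ch)
        # Strip control characters: 0-8, 11-31 and 127-159 (tab and newline stay).
        if code <= 8 or 11 <= code <= 31 or 127 <= code <= 159:
            continue
        if ch in named:
            out.append(named[ch])
        elif code > 127:
            # Numeric character reference: independent of the output encoding.
            out.append(f"&#{code};")
        else:
            out.append(ch)
    return "".join(out)
```

The quoting is applied at the point where the text is substituted into the HTML, so a submitted `<script>` arrives as `&lt;script&gt;` and is displayed rather than executed.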

Date: 2006-05-28 11:03 am (UTC)
From: [identity profile] lethargic-man.livejournal.com
Great, thanks.

Date: 2005-10-02 03:27 pm (UTC)
From: [identity profile] pseudomonas.livejournal.com
Don't bother with Flash for anything that you can do any other way at all. You'll only have to do it all twice for the benefit of people without Flash (and keep the two versions synched).

Date: 2005-10-02 03:44 pm (UTC)
From: [identity profile] lethargic-man.livejournal.com
I hadn't intended to, but the site I've been recommended to base mine on is completely Flash-based.

Date: 2005-10-11 09:06 am (UTC)
From: [identity profile] snjstar.livejournal.com
Good luck

