[ python-Bugs-649967 ] urllib.urlopen('file:/...') uses FTP
Initial Comment:
urllib.urlopen(), when given a 'file' URL containing a host
part, like 'file://somehost/path/to/file', treats it as if it
were an 'ftp' URL.
While RFC 1738 acknowledges that the access method
for file URLs is unspecified, the assumption of FTP, even
when a direct access method is available, is a poor
design decision and is a possible security risk in
applications that use urlopen().
When given a file URL, urlopen() should extract the
portion following 'file:', convert a leading '//localhost/'
to '///' (because localhost is a special case per RFC
1738; see other bug report on this topic), and use
url2pathname() to try to convert this to an OS-specific
path. The result can then be passed to open().
For example, on Windows,
urlopen('file://somehost/path/to/file') should return the result of
open(r'\\somehost\path\to\file', 'rb'), i.e. the UNC path.
In situations where there is no convention for interpreting
the host part of a URL as a component in an OS path,
such as on Unix filesystems, an exception should be
raised by url2pathname(), in my opinion. If urlopen()
wants to try an alternate access method such as FTP, it
should only do so if directed by the caller.
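A minimal sketch of the mapping proposed above, written against modern Python 3 (urllib.parse) rather than the 2002 urllib. The helper name file_url_to_path and its windows flag are hypothetical, and the path conversion is done by hand instead of calling url2pathname() so that the example behaves the same on every platform:

```python
import re
from urllib.parse import urlsplit, unquote

def file_url_to_path(url, windows=False):
    """Hypothetical helper: map a file URL to an OS path, never falling back to FTP."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file URL: %r" % (url,))
    # 'localhost' is the RFC 1738 special case: treat it like an empty host.
    host = "" if parts.netloc.lower() == "localhost" else parts.netloc
    path = unquote(parts.path)
    if not windows:
        if host:
            # No Unix convention for a host component in a path: refuse.
            raise ValueError("cannot map host %r to a local path" % (host,))
        return path
    if host:
        # Windows UNC form: \\host\path\to\file
        return "\\\\" + host + path.replace("/", "\\")
    # Drop the slash in front of a drive letter: /C:/dir -> C:\dir
    if re.match(r"/[A-Za-z]:", path):
        path = path[1:]
    return path.replace("/", "\\")
```

The returned string could then be passed to open(path, 'rb'). Any alternate access method such as FTP would have to be requested explicitly by the caller, since the helper raises instead of guessing.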
----------------------------------------------------------------------
>Comment By: Mike Brown (mike_j_brown)
Date: 2002-12-08 15:22
Message:
Logged In: YES
user_id=371366
If you document it, it's not a bug?
The docs say that the fallback on FTP is "for backward
compatibility" ... backward compatibility with what?
The fact that it's a possible security risk should at least be
documented. An application on a machine behind a firewall
might not be expecting 'file' URLs to result in hitting the FTP
servers of that machine or its neighbors.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2002-12-08 02:03
Message:
Logged In: YES
user_id=21627
This is not a bug; it is documented behaviour: see
http://python.org/doc/current/lib/module-urllib.html
To override this behaviour, use urllib2, and inherit from
FileHandler.
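A rough sketch of that override, assuming modern Python 3, where urllib2 became urllib.request. The class name LocalOnlyFileHandler is hypothetical; it refuses any file URL whose host part is neither empty nor 'localhost', rather than trying another access method:

```python
import urllib.request
from urllib.error import URLError

class LocalOnlyFileHandler(urllib.request.FileHandler):
    """Hypothetical handler: serve file URLs from the local disk only."""

    def file_open(self, req):
        host = req.host
        if host and host.lower() != "localhost":
            # Refuse instead of guessing at an access method for remote hosts.
            raise URLError("refusing non-local file URL host: %r" % (host,))
        return self.open_local_file(req)

# build_opener() uses this subclass in place of the default FileHandler.
opener = urllib.request.build_opener(LocalOnlyFileHandler)
```

With this opener, opener.open('file://somehost/path/to/file') raises URLError, while URLs with an empty host, such as 'file:///path/to/file', are still read from the local filesystem.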
When we talk about firewalls, are we talking about frames in Python?