RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung
JBS: https://bugs.openjdk.java.net/browse/JDK-8188122

Adding a warning message if the full path or the directory length
exceeds MAX_PATH on windows.

Example warning messages.

1) The full path exceeds MAX_PATH:

Java HotSpot(TM) 64-Bit Server VM warning: construct full path name
failed: total length 270 exceeds max length of 260
     dir
T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
length 259
     name Hello.class length 11

2) The directory path exceeds MAX_PATH:

Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed: path length
265 exceeds max length 260
     path
T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx

webrev:
     http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/

Testing:
     JPRT (including the new test)

thanks,
Calvin
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

David Holmes
Hi Calvin,

Isn't there a Unicode API we can use for this?

"The Windows API has many functions that also have Unicode versions to
permit an extended-length path for a maximum total path length of 32,767
characters. " [1]

Thanks,
David

[1]
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath

On 17/10/2017 8:15 AM, Calvin Cheung wrote:

> JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>
> Adding a warning message if the full path or the directory length
> exceeds MAX_PATH on windows.
>
> Example warning messages.
>
> 1) The full path exceeds MAX_PATH:
>
> Java HotSpot(TM) 64-Bit Server VM warning: construct full path name
> failed: total length 270 exceeds max length of 260
>      dir
> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> length 259
>      name Hello.class length 11
>
> 2) The directory path exceeds MAX_PATH:
>
> Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed: path length
> 265 exceeds max length 260
>      path
> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>
>
> webrev:
>      http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>
> Testing:
>      JPRT (including the new test)
>
> thanks,
> Calvin
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Ioi Lam
In reply to this post by Calvin Cheung
Hi Calvin,

 From your test case:

   44     private static final int MAX_PATH = 260;
   ...

   53         int subDirLen = MAX_PATH - classDir.toString().length() - 2;
   54         if (subDirLen > 0) {
   55             char[] chars = new char[subDirLen];
   56             Arrays.fill(chars, 'x');
   57             String subPath = new String(chars);
   58             destDir =
Paths.get(System.getProperty("test.classes"), subPath);
   59         }
   60
   61         CompilerUtils.compile(sourceDir, destDir);

It looks like at least part of the JDK is able to read and write files
that are in paths longer than 260 characters.

For JDK-8188122, it seems the problem exists only with -Xbootclasspath/a
is used. Have you tested if you can use -cp to load classes from the
long directory?

If that's the case, maybe we should fix -Xbootclasspath/a so that it can
handle over 260 chars on Windows?

Thanks

- Ioi



On 10/16/17 3:15 PM, Calvin Cheung wrote:

> JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>
> Adding a warning message if the full path or the directory length
> exceeds MAX_PATH on windows.
>
> Example warning messages.
>
> 1) The full path exceeds MAX_PATH:
>
> Java HotSpot(TM) 64-Bit Server VM warning: construct full path name
> failed: total length 270 exceeds max length of 260
>     dir
> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> length 259
>     name Hello.class length 11
>
> 2) The directory path exceeds MAX_PATH:
>
> Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed: path
> length 265 exceeds max length 260
>     path
> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>
> webrev:
>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>
> Testing:
>     JPRT (including the new test)
>
> thanks,
> Calvin

Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung
In reply to this post by David Holmes
Hi David,

In order to use the Unicode API, the hotspot windows version needs to be
compiled with UNICODE defined.
I can file an RFE for that if we want to pursue that enhancement.
For now, I think adding more informative message other than just a CNFE
should be sufficient.

What do you think?

thanks,
Calvin

On 10/16/17, 4:06 PM, David Holmes wrote:

> Hi Calvin,
>
> Isn't there a Unicode API we can use for this?
>
> "The Windows API has many functions that also have Unicode versions to
> permit an extended-length path for a maximum total path length of
> 32,767 characters. " [1]
>
> Thanks,
> David
>
> [1]
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath
>
> On 17/10/2017 8:15 AM, Calvin Cheung wrote:
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>>
>> Adding a warning message if the full path or the directory length
>> exceeds MAX_PATH on windows.
>>
>> Example warning messages.
>>
>> 1) The full path exceeds MAX_PATH:
>>
>> Java HotSpot(TM) 64-Bit Server VM warning: construct full path name
>> failed: total length 270 exceeds max length of 260
>>      dir
>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> length 259
>>      name Hello.class length 11
>>
>> 2) The directory path exceeds MAX_PATH:
>>
>> Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed: path
>> length 265 exceeds max length 260
>>      path
>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>
>>
>> webrev:
>>      http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>
>> Testing:
>>      JPRT (including the new test)
>>
>> thanks,
>> Calvin
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung
In reply to this post by Ioi Lam
Hi Ioi,

On 10/16/17, 4:15 PM, Ioi Lam wrote:

> Hi Calvin,
>
> From your test case:
>
>   44     private static final int MAX_PATH = 260;
>   ...
>
>   53         int subDirLen = MAX_PATH - classDir.toString().length() - 2;
>   54         if (subDirLen > 0) {
>   55             char[] chars = new char[subDirLen];
>   56             Arrays.fill(chars, 'x');
>   57             String subPath = new String(chars);
>   58             destDir =
> Paths.get(System.getProperty("test.classes"), subPath);
>   59         }
>   60
>   61         CompilerUtils.compile(sourceDir, destDir);
>
> It looks like at least part of the JDK is able to read and write files
> that are in paths longer than 260 characters.
>
> For JDK-8188122, it seems the problem exists only with
> -Xbootclasspath/a is used. Have you tested if you can use -cp to load
> classes from the long directory?
Yes, the problem exists only with -Xbootclasspath/a because the call
path is different comparing with -cp.

Call path with -Xbootclasspath/a:
     ClassPathDirEntry::open_stream() at classLoader.cpp:274 0x7ffff5ca48d0
     ClassLoader::load_class() at classLoader.cpp:1,412 0x7ffff5ca94a8
     SystemDictionary::load_instance_class() at
systemDictionary.cpp:1,524 0x7ffff671e141
     SystemDictionary::resolve_instance_class_or_null() at
systemDictionary.cpp:847 0x7ffff671bb3f
     SystemDictionary::resolve_or_null() at systemDictionary.cpp:263
0x7ffff671a1a1
     SystemDictionary::resolve_or_null() at systemDictionary.cpp:268
0x7ffff671a215
     JVM_FindClassFromBootLoader() at jvm.cpp:777 0x7ffff6204544
     Java_java_lang_ClassLoader_findBootstrapClass() at 0x7ffff4a8a0e6

Call path with -cp:

     ClassFileParser::fill_instance_klass() at classFileParser.cpp:5,309
0x7ffff5c93a6d
     ClassFileParser::create_instance_klass() at
classFileParser.cpp:5,289 0x7ffff5c938bd
     KlassFactory::create_from_stream() at klassFactory.cpp:214
0x7ffff63462ef
     SystemDictionary::resolve_from_stream() at
systemDictionary.cpp:1,144 0x7ffff671cadd
     jvm_define_class_common() at jvm.cpp:943 0x7ffff6205376
     JVM_DefineClassWithSource() at jvm.cpp:963 0x7ffff620588d
     Java_java_lang_ClassLoader_defineClass1() at 0x7ffff4a89c68

Note that ClassPathDirEntry::open_stream() isn't called when -cp is used.
>
> If that's the case, maybe we should fix -Xbootclasspath/a so that it
> can handle over 260 chars on Windows?
No easy way of fixing it other than using the unicode version of windows
API.

thanks,
Calvin

>
> Thanks
>
> - Ioi
>
>
>
> On 10/16/17 3:15 PM, Calvin Cheung wrote:
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>>
>> Adding a warning message if the full path or the directory length
>> exceeds MAX_PATH on windows.
>>
>> Example warning messages.
>>
>> 1) The full path exceeds MAX_PATH:
>>
>> Java HotSpot(TM) 64-Bit Server VM warning: construct full path name
>> failed: total length 270 exceeds max length of 260
>>     dir
>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> length 259
>>     name Hello.class length 11
>>
>> 2) The directory path exceeds MAX_PATH:
>>
>> Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed: path
>> length 265 exceeds max length 260
>>     path
>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>
>> webrev:
>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>
>> Testing:
>>     JPRT (including the new test)
>>
>> thanks,
>> Calvin
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

David Holmes
On 17/10/2017 10:47 AM, Calvin Cheung wrote:

> Hi Ioi,
>
> On 10/16/17, 4:15 PM, Ioi Lam wrote:
>> Hi Calvin,
>>
>> From your test case:
>>
>>   44     private static final int MAX_PATH = 260;
>>   ...
>>
>>   53         int subDirLen = MAX_PATH - classDir.toString().length() - 2;
>>   54         if (subDirLen > 0) {
>>   55             char[] chars = new char[subDirLen];
>>   56             Arrays.fill(chars, 'x');
>>   57             String subPath = new String(chars);
>>   58             destDir =
>> Paths.get(System.getProperty("test.classes"), subPath);
>>   59         }
>>   60
>>   61         CompilerUtils.compile(sourceDir, destDir);
>>
>> It looks like at least part of the JDK is able to read and write files
>> that are in paths longer than 260 characters.
>>
>> For JDK-8188122, it seems the problem exists only with
>> -Xbootclasspath/a is used. Have you tested if you can use -cp to load
>> classes from the long directory?
> Yes, the problem exists only with -Xbootclasspath/a because the call
> path is different comparing with -cp.

The point is if -cp can handle it then -Xbootclasspath must be able to
handle it too! They both have to read directories!

Thanks,
David

> Call path with -Xbootclasspath/a:
>      ClassPathDirEntry::open_stream() at classLoader.cpp:274 0x7ffff5ca48d0
>      ClassLoader::load_class() at classLoader.cpp:1,412 0x7ffff5ca94a8
>      SystemDictionary::load_instance_class() at
> systemDictionary.cpp:1,524 0x7ffff671e141
>      SystemDictionary::resolve_instance_class_or_null() at
> systemDictionary.cpp:847 0x7ffff671bb3f
>      SystemDictionary::resolve_or_null() at systemDictionary.cpp:263
> 0x7ffff671a1a1
>      SystemDictionary::resolve_or_null() at systemDictionary.cpp:268
> 0x7ffff671a215
>      JVM_FindClassFromBootLoader() at jvm.cpp:777 0x7ffff6204544
>      Java_java_lang_ClassLoader_findBootstrapClass() at 0x7ffff4a8a0e6
>
> Call path with -cp:
>
>      ClassFileParser::fill_instance_klass() at classFileParser.cpp:5,309
> 0x7ffff5c93a6d
>      ClassFileParser::create_instance_klass() at
> classFileParser.cpp:5,289 0x7ffff5c938bd
>      KlassFactory::create_from_stream() at klassFactory.cpp:214
> 0x7ffff63462ef
>      SystemDictionary::resolve_from_stream() at
> systemDictionary.cpp:1,144 0x7ffff671cadd
>      jvm_define_class_common() at jvm.cpp:943 0x7ffff6205376
>      JVM_DefineClassWithSource() at jvm.cpp:963 0x7ffff620588d
>      Java_java_lang_ClassLoader_defineClass1() at 0x7ffff4a89c68
>
> Note that ClassPathDirEntry::open_stream() isn't called when -cp is used.
>>
>> If that's the case, maybe we should fix -Xbootclasspath/a so that it
>> can handle over 260 chars on Windows?
> No easy way of fixing it other than using the unicode version of windows
> API.
>
> thanks,
> Calvin
>>
>> Thanks
>>
>> - Ioi
>>
>>
>>
>> On 10/16/17 3:15 PM, Calvin Cheung wrote:
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>>>
>>> Adding a warning message if the full path or the directory length
>>> exceeds MAX_PATH on windows.
>>>
>>> Example warning messages.
>>>
>>> 1) The full path exceeds MAX_PATH:
>>>
>>> Java HotSpot(TM) 64-Bit Server VM warning: construct full path name
>>> failed: total length 270 exceeds max length of 260
>>>     dir
>>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> length 259
>>>     name Hello.class length 11
>>>
>>> 2) The directory path exceeds MAX_PATH:
>>>
>>> Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed: path
>>> length 265 exceeds max length 260
>>>     path
>>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>>
>>>
>>> webrev:
>>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>>
>>> Testing:
>>>     JPRT (including the new test)
>>>
>>> thanks,
>>> Calvin
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung
In reply to this post by Calvin Cheung
I've reworked this fix by using the Unicode version of open (wopen) to
handle path name longer than max path with the path prefixed to indicate
an extended length path as described in [1].

The Unicode version of stat (wstat) doesn't work well with long path
[2]. So FindFirstFileW will be used.The data in WIN32_FIND_DATA returned
from FindFirstFileW needs to be transferred to the stat structure since
the caller expects a return stat structure and other platforms return a
stat structure.

In classLoader.cpp, calculate the size of buffer required instead of
limiting it to JVM_MAXPATHLEN.
In os_windows.cpp, dynamically allocate buffers in os::open and os::stat.

updated webrev: http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/

It passed hs-tier2 testing using mach5.

thanks,
Calvin

[1]
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH 

[2]
https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral

On 10/16/17, 3:15 PM, Calvin Cheung wrote:

> JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>
> Adding a warning message if the full path or the directory length
> exceeds MAX_PATH on windows.
>
> Example warning messages.
>
> 1) The full path exceeds MAX_PATH:
>
> Java HotSpot(TM) 64-Bit Server VM warning: construct full path name
> failed: total length 270 exceeds max length of 260
>     dir
> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> length 259
>     name Hello.class length 11
>
> 2) The directory path exceeds MAX_PATH:
>
> Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed: path
> length 265 exceeds max length 260
>     path
> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>
> webrev:
>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>
> Testing:
>     JPRT (including the new test)
>
> thanks,
> Calvin
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Thomas Stüfe-2
Hi Calvin,

this is a worthwhile addition, thank you for your work!

=========================

src/hotspot/os/windows/os_windows.cpp

Could you please use wchar_t instead of WCHAR?

---------------
in os::stat():

This cumbersome malloc/strcpy sequence:

!   size_t path_len = strlen(path) + 1;
+   char* pathbuf = (char*)os::malloc(path_len * sizeof(char), mtInternal);
+   if (pathbuf == NULL) {
+     errno = ENOMEM;
      return -1;
    }
    os::native_path(strcpy(pathbuf, path));

 can be reduced to a simple strdup:

char* pathbuf = os::strdup(path, mtInternal);
if (pathbuf == NULL) {
  errno = ENOMEM;
  return -1;
}
os::native_path(pathbuf);

This also would separate strcpy() from os::native_path() which is a bit
unreadable.

--------------------

The single-byte path (strdup, ::stat()), together with its free(), can be
moved inside the (strlen(path) < MAX_PATH) condition. os::native_path will
not change the path length (it better not, as it operates on the input
buffer). That avoids  having two allocations on the doublebyte path.

-----------------------

But seeing that translation to UNC paths is something we might want more
often, I would combine allocation and UNC prefix adding to one function
like this, to avoid the memove and increase reusability:

// Creates an unc path from a single byte path. Return buffer is allocated
in C heap and needs to be freed by caller. Returns NULL on error.
static whchar_t* create_unc_path(const char* s) {
  - does s start with \\?\ ?
- yes:
            - os::malloc(strlen(s) + 1) and mbstowcs_s
        - no:
            - os::malloc(strlen(s) + 1 + 4), mbstowcs_s to fourth position
in string, prefix with L"\\?\"
}

We also need error checking to mbstowcs_s.

Just for cleanliness, I would then wrap deallocation into an own function.

static viud destroy_unc_path(whchar_t* s) { os::free(s); }

These two functions could be candidates of putting into os::win32
namespace, as this form of ANSI->UNC path translation is quite common -
whoever wants to use the xxxxW() functions must do this.

-----------------------

FindFirstFileW:

I am pretty sure that you can achieve the same result with
GetFileAttributesExW(). It also returns WIN32_FIND_DATAW.

It is way more straightforward to use than FindFirstFileW, as it does not
require you to write a callback hook.

-------

eval_find_data(): This is more of a generic helper function, could you
rename this to something clearer, e.g. make_double_word() ?

Also, setup_stat(), could this be renamed more clearly to something like
WIN32_FIND_DATA_to_stat? or lowercase if this bothers you :)

==================================
src/hotspot/share/classfile/classLoader.cpp

In ClassPathDirEntry::open_stream(), I would feel better if we were
asserting _dir and name to be != NULL before feeding it to strlen.

===================

Thanks!

Thomas






On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung <[hidden email]>
wrote:

> I've reworked this fix by using the Unicode version of open (wopen) to
> handle path name longer than max path with the path prefixed to indicate an
> extended length path as described in [1].
>
> The Unicode version of stat (wstat) doesn't work well with long path [2].
> So FindFirstFileW will be used.The data in WIN32_FIND_DATA returned from
> FindFirstFileW needs to be transferred to the stat structure since the
> caller expects a return stat structure and other platforms return a stat
> structure.
>
> In classLoader.cpp, calculate the size of buffer required instead of
> limiting it to JVM_MAXPATHLEN.
> In os_windows.cpp, dynamically allocate buffers in os::open and os::stat.
>
> updated webrev: http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>
> It passed hs-tier2 testing using mach5.
>
> thanks,
> Calvin
>
> [1] https://msdn.microsoft.com/en-us/library/windows/desktop/aa3
> 65247(v=vs.85).aspx#MAX_PATH
> [2] https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093
> ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>
>
> On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>>
>> Adding a warning message if the full path or the directory length exceeds
>> MAX_PATH on windows.
>>
>> Example warning messages.
>>
>> 1) The full path exceeds MAX_PATH:
>>
>> Java HotSpot(TM) 64-Bit Server VM warning: construct full path name
>> failed: total length 270 exceeds max length of 260
>>     dir T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClas
>> s\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> length 259
>>     name Hello.class length 11
>>
>> 2) The directory path exceeds MAX_PATH:
>>
>> Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed: path length
>> 265 exceeds max length 260
>>     path T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClas
>> s\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>
>> webrev:
>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>
>> Testing:
>>     JPRT (including the new test)
>>
>> thanks,
>> Calvin
>>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung
Hi Thomas,

On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
> Hi Calvin,
>
> this is a worthwhile addition, thank you for your work!
Thanks for your review.
>
> =========================
>
> src/hotspot/os/windows/os_windows.cpp
>
> Could you please use wchar_t instead of WCHAR?
Fixed.

>
> ---------------
> in os::stat():
>
> This cumbersome malloc/strcpy sequence:
>
> !   size_t path_len = strlen(path) + 1;
> +   char* pathbuf = (char*)os::malloc(path_len * sizeof(char),
> mtInternal);
> +   if (pathbuf == NULL) {
> +     errno = ENOMEM;
>       return -1;
>     }
>     os::native_path(strcpy(pathbuf, path));
>
>  can be reduced to a simple strdup:
>
> char* pathbuf = os::strdup(path, mtInternal);
> if (pathbuf == NULL) {
>   errno = ENOMEM;
>   return -1;
> }
> os::native_path(pathbuf);
>
> This also would separate strcpy() from os::native_path() which is a
> bit unreadable.
I've made the change.
>
> --------------------
>
> The single-byte path (strdup, ::stat()), together with its free(), can
> be moved inside the (strlen(path) < MAX_PATH) condition.
> os::native_path will not change the path length (it better not, as it
> operates on the input buffer). That avoids  having two allocations on
> the doublebyte path.

os::native_path() converts a path like C:\\\\somedir to C:\\somedir
So I'll need the converted path in the wchar_t case too. The freeing of
the pathbuf is being done at the end of the function or in the middle of
the wchar_t case if there's an error.

>
> -----------------------
>
> But seeing that translation to UNC paths is something we might want
> more often, I would combine allocation and UNC prefix adding to one
> function like this, to avoid the memove and increase reusability:
>
> // Creates an unc path from a single byte path. Return buffer is
> allocated in C heap and needs to be freed by caller. Returns NULL on
> error.
> static whchar_t* create_unc_path(const char* s) {
>   - does s start with \\?\ ?
> - yes:
>             - os::malloc(strlen(s) + 1) and mbstowcs_s
>         - no:
>             - os::malloc(strlen(s) + 1 + 4), mbstowcs_s to fourth
> position in string, prefix with L"\\?\"
> }

I also include the case for adding  L"\\\\?\\UNC\0" at the beginning to
be consistent with libjava/canonicalize_md.c.
>
> We also need error checking to mbstowcs_s.
I've added assert like the following after the call:

assert(converted_chars == path_len, "sanity");
>
> Just for cleanliness, I would then wrap deallocation into an own function.
>
> static viud destroy_unc_path(whchar_t* s) { os::free(s); }
I've added the destroy function.
>
> These two functions could be candidates of putting into os::win32
> namespace, as this form of ANSI->UNC path translation is quite common
> - whoever wants to use the xxxxW() functions must do this.
I'm leaving them in os_windows.cpp since they're being used only within
that file.
>
> -----------------------
>
> FindFirstFileW:
>
> I am pretty sure that you can achieve the same result with
> GetFileAttributesExW(). It also returns WIN32_FIND_DATAW.
It actually returns WIN32_FILE_ATTRIBUTE_DATA and is very similar to
WIN32_FIND_DATAW.
>
> It is way more straightforward to use than FindFirstFileW, as it does
> not require you to write a callback hook.
I've switched to using GetFileAttributesExW().
>
> -------
>
> eval_find_data(): This is more of a generic helper function, could you
> rename this to something clearer, e.g. make_double_word() ?
Ok. I've renamed it.
>
> Also, setup_stat(), could this be renamed more clearly to something
> like WIN32_FIND_DATA_to_stat? or lowercase if this bothers you :)
I'm naming the function as file_attribute_data_to_stat() to match with
the data structure name.
>
> ==================================
> src/hotspot/share/classfile/classLoader.cpp
>
> In ClassPathDirEntry::open_stream(), I would feel better if we were
> asserting _dir and name to be != NULL before feeding it to strlen.
I've added an assert statement.
>
> ===================
Here's an updated webrev:
     http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/

thanks,
Calvin

>
> Thanks!
>
> Thomas
>
>
>
>
>
>
> On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     I've reworked this fix by using the Unicode version of open
>     (wopen) to handle path name longer than max path with the path
>     prefixed to indicate an extended length path as described in [1].
>
>     The Unicode version of stat (wstat) doesn't work well with long
>     path [2]. So FindFirstFileW will be used.The data in
>     WIN32_FIND_DATA returned from FindFirstFileW needs to be
>     transferred to the stat structure since the caller expects a
>     return stat structure and other platforms return a stat structure.
>
>     In classLoader.cpp, calculate the size of buffer required instead
>     of limiting it to JVM_MAXPATHLEN.
>     In os_windows.cpp, dynamically allocate buffers in os::open and
>     os::stat.
>
>     updated webrev:
>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>     <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>
>     It passed hs-tier2 testing using mach5.
>
>     thanks,
>     Calvin
>
>     [1]
>     https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH
>     <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>
>
>     [2]
>     https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>     <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>
>
>
>     On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>
>         JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>         <https://bugs.openjdk.java.net/browse/JDK-8188122>
>
>         Adding a warning message if the full path or the directory
>         length exceeds MAX_PATH on windows.
>
>         Example warning messages.
>
>         1) The full path exceeds MAX_PATH:
>
>         Java HotSpot(TM) 64-Bit Server VM warning: construct full path
>         name failed: total length 270 exceeds max length of 260
>             dir
>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>         length 259
>             name Hello.class length 11
>
>         2) The directory path exceeds MAX_PATH:
>
>         Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed:
>         path length 265 exceeds max length 260
>             path
>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>
>         webrev:
>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>
>         Testing:
>             JPRT (including the new test)
>
>         thanks,
>         Calvin
>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Thomas Stüfe-2
Hi Calvin,

On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung <[hidden email]>
wrote:

> Hi Thomas,
>
> On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>
>> Hi Calvin,
>>
>> this is a worthwhile addition, thank you for your work!
>>
> Thanks for your review.


Thanks for your work :)

First off, there is another issue in file_attribute_data_to_stat(). We
actually had this issue before, but your work uncovered it:

According to POSIX (
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html)
and every unix manpage I looked at (solaris, linux, aix..), st_ctime is not
the file creation time but the last time file status was changed. Only
Microsoft gets this wrong in their posix layer, its stat() function returns
 (https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx) creation time in
st_ctime.

But as os::stat() is a platform independent layer, I'd say we should behave
the same on all platforms, and that would be the Posix way.

I did not find any code calling os::stat() and using st_ctime, so this is
still save to change for windows. (Note that I found that
perfMemory_xxx.cpp uses plain OS ::stat and misuses ctime as "creation
time" - I opened a bug for that (
https://bugs.openjdk.java.net/browse/JDK-8190260 - but that does not affect
us, as they do not call os::stat()).

There is one more complication: in os::stat() we use plain ::stat() in the
single byte case.:

*+ if (strlen(path) < MAX_PATH) {*

*+     ret = ::stat(pathbuf, sbuf);**+   } else {*


But ::stat() on Windows is broken, as I wrote above. We should not use it,
especially not if we change the meaing of st_ctime in the double byte path.

My suggestion would be to just always call GetFileAttributes(), also for
the single byte path. A very simple solution would be to just always go the
double byte path with UNC translation, regardless of the path length. Or,
if you are worried about performance, for paths shorter than MAX_PATH we
use GetFileAttributesA(), for longer paths GetFileAttributesW with UNC
translation. In both cases you get a WIN32_FILE_ATTRIBUTE_DATA which you
can feed tp your  file_attribute_data_to_stat() routine and have the struct
stat you return be always consistent.


But onward:


>
>> =========================
>>
>> src/hotspot/os/windows/os_windows.cpp
>>
>> Could you please use wchar_t instead of WCHAR?
>>
> Fixed.
>
>>
>> ---------------
>> in os::stat():
>>
>> This cumbersome malloc/strcpy sequence:
>>
>> !   size_t path_len = strlen(path) + 1;
>> +   char* pathbuf = (char*)os::malloc(path_len * sizeof(char),
>> mtInternal);
>> +   if (pathbuf == NULL) {
>> +     errno = ENOMEM;
>>       return -1;
>>     }
>>     os::native_path(strcpy(pathbuf, path));
>>
>>  can be reduced to a simple strdup:
>>
>> char* pathbuf = os::strdup(path, mtInternal);
>> if (pathbuf == NULL) {
>>   errno = ENOMEM;
>>   return -1;
>> }
>> os::native_path(pathbuf);
>>
>> This also would separate strcpy() from os::native_path() which is a bit
>> unreadable.
>>
> I've made the change.
>
>>
>> --------------------
>>
>> The single-byte path (strdup, ::stat()), together with its free(), can be
>> moved inside the (strlen(path) < MAX_PATH) condition. os::native_path will
>> not change the path length (it better not, as it operates on the input
>> buffer). That avoids  having two allocations on the doublebyte path.
>>
>
> os::native_path() converts a path like C:\\\\somedir to C:\\somedir
> So I'll need the converted path in the wchar_t case too. The freeing of
> the pathbuf is being done at the end of the function or in the middle of
> the wchar_t case if there's an error.


Oh, you are right,of course. I missed that it this is needed for both paths.


>
>
>> -----------------------
>>
>> But seeing that translation to UNC paths is something we might want more
>> often, I would combine allocation and UNC prefix adding to one function
>> like this, to avoid the memove and increase reusability:
>>
>> // Creates an unc path from a single byte path. Return buffer is
>> allocated in C heap and needs to be freed by caller. Returns NULL on error.
>> static whchar_t* create_unc_path(const char* s) {
>>   - does s start with \\?\ ?
>> - yes:
>>             - os::malloc(strlen(s) + 1) and mbstowcs_s
>>         - no:
>>             - os::malloc(strlen(s) + 1 + 4), mbstowcs_s to fourth
>> position in string, prefix with L"\\?\"
>> }
>>
>
> I also include the case for adding  L"\\\\?\\UNC\0" at the beginning to be
> consistent with libjava/canonicalize_md.c.


>> We also need error checking to mbstowcs_s.
>>
> I've added assert like the following after the call:
>
> assert(converted_chars == path_len, "sanity");



create_unc_path() :

- could you convert the /**/ to // comments, please?

- not sure about the assert after mbstowcs_s: if we happen to encounter an
invalid multibyte character, function will fail and return an error:

https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
"If mbstowcs_s encounters an invalid multibyte character, it puts 0 in
*``pReturnValue, sets the destination buffer to an empty string, sets errno
to EILSEQ, and returns EILSEQ."

 As this is dependent on user data, we should not assert, but handle the
return code of mbstowcs_s gracefully. Same goes for
libjava/canonicalize_md.c.

- Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len + 7, path,
path_len);
third parameter is wrong, as we hand in an offset into the buffer, we must
decrement the buffer size by this offset, so correct would be path_len +7 -
7 or just path_len.

- Same error below: + ::mbstowcs_s(&converted_chars, &wpath[4], path_len +
4, path, path_len);



>
>
>> Just for cleanliness, I would then wrap deallocation into an own function.
>>
>> static viud destroy_unc_path(whchar_t* s) { os::free(s); }
>>
> I've added the destroy function.
>
>>
>> These two functions could be candidates of putting into os::win32
>> namespace, as this form of ANSI->UNC path translation is quite common -
>> whoever wants to use the xxxxW() functions must do this.
>>
> I'm leaving them in os_windows.cpp since they're being used only within
> that file.


Fine by me.


>
>
>> -----------------------
>>
>> FindFirstFileW:
>>
>> I am pretty sure that you can achieve the same result with
>> GetFileAttributesExW(). It also returns WIN32_FIND_DATAW.
>>
> It actually returns WIN32_FILE_ATTRIBUTE_DATA and is very similar to
> WIN32_FIND_DATAW.
>
>>
>> It is way more straightforward to use than FindFirstFileW, as it does not
>> require you to write a callback hook.
>>
> I've switched to using GetFileAttributesExW().


Thank you, this is way more readable.

Another issue: do we still need the fix for 6539723 if we switch from
stat() to GetFileAttributesExW()? This fix looks important, but the comment
indicates that it could break things if the original bug is not present.

Btw, this is another strong argument for scrapping ::stat() altogether on
all code paths, not only for long input paths, because stat() and
GetFileAttributesExW() may behave
differently. So it would be good to use the same API on all code paths, in
order to get the best test coverage.


>
>> -------
>>
>> eval_find_data(): This is more of a generic helper function, could you
>> rename this to something clearer, e.g. make_double_word() ?
>>
> Ok. I've renamed it.
>
>>
>> Also, setup_stat(), could this be renamed more clearly to something like
>> WIN32_FIND_DATA_to_stat? or lowercase if this bothers you :)
>>
> I'm naming the function as file_attribute_data_to_stat() to match with the
> data structure name.


Thanks for taking my suggestions.


>
>
>> ==================================
>> src/hotspot/share/classfile/classLoader.cpp
>>
>> In ClassPathDirEntry::open_stream(), I would feel better if we were
>> asserting _dir and name to be != NULL before feeding it to strlen.
>>
> I've added an assert statement.
>
>>
>> ===================
>>
> Here's an updated webrev:
>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>
>
Thanks!

Thomas


> thanks,
> Calvin
>
>
>> Thanks!
>>
>> Thomas
>>
>>
>>
>>
>>
>>
>> On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>>     I've reworked this fix by using the Unicode version of open
>>     (wopen) to handle path name longer than max path with the path
>>     prefixed to indicate an extended length path as described in [1].
>>
>>     The Unicode version of stat (wstat) doesn't work well with long
>>     path [2]. So FindFirstFileW will be used.The data in
>>     WIN32_FIND_DATA returned from FindFirstFileW needs to be
>>     transferred to the stat structure since the caller expects a
>>     return stat structure and other platforms return a stat structure.
>>
>>     In classLoader.cpp, calculate the size of buffer required instead
>>     of limiting it to JVM_MAXPATHLEN.
>>     In os_windows.cpp, dynamically allocate buffers in os::open and
>>     os::stat.
>>
>>     updated webrev:
>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>>     <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>>
>>     It passed hs-tier2 testing using mach5.
>>
>>     thanks,
>>     Calvin
>>
>>     [1]
>>     https://msdn.microsoft.com/en-us/library/windows/desktop/aa3
>> 65247(v=vs.85).aspx#MAX_PATH
>>     <https://msdn.microsoft.com/en-us/library/windows/desktop/aa
>> 365247%28v=vs.85%29.aspx#MAX_PATH>
>>
>>     [2]
>>     https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093
>> ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>     <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c09
>> 3ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>>
>>
>>
>>     On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>>
>>         JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>>         <https://bugs.openjdk.java.net/browse/JDK-8188122>
>>
>>         Adding a warning message if the full path or the directory
>>         length exceeds MAX_PATH on windows.
>>
>>         Example warning messages.
>>
>>         1) The full path exceeds MAX_PATH:
>>
>>         Java HotSpot(TM) 64-Bit Server VM warning: construct full path
>>         name failed: total length 270 exceeds max length of 260
>>             dir
>>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClas
>> s\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>         length 259
>>             name Hello.class length 11
>>
>>         2) The directory path exceeds MAX_PATH:
>>
>>         Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed:
>>         path length 265 exceeds max length 260
>>             path
>>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClas
>> s\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> xxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>
>>         webrev:
>>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>>
>>         Testing:
>>             JPRT (including the new test)
>>
>>         thanks,
>>         Calvin
>>
>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung


On 10/27/17, 12:55 AM, Thomas Stüfe wrote:

> Hi Calvin,
>
> On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     Hi Thomas,
>
>     On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>
>         Hi Calvin,
>
>         this is a worthwhile addition, thank you for your work!
>
>     Thanks for your review.
>
>
> Thanks for your work :)
>
> First off, there is another issue in file_attribute_data_to_stat(). We
> actually had this issue before, but your work uncovered it:
>
> According to POSIX
> (http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html)
> and every unix manpage I looked at (solaris, linux, aix..), st_ctime
> is not the file creation time but the last time file status was
> changed. Only Microsoft gets this wrong in their posix layer, its
> stat() function returns
>  (https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx) creation
> time in st_ctime.
>
> But as os::stat() is a platform independent layer, I'd say we should
> behave the same on all platforms, and that would be the Posix way.
>
> I did not find any code calling os::stat() and using st_ctime, so this
> is still save to change for windows. (Note that I found that
> perfMemory_xxx.cpp uses plain OS ::stat and misuses ctime as "creation
> time" - I opened a bug for that
> (https://bugs.openjdk.java.net/browse/JDK-8190260 - but that does not
> affect us, as they do not call os::stat()).
>
> There is one more complication: in os::stat() we use plain ::stat() in
> the single byte case.:
>
> *+ if (strlen(path) < MAX_PATH) {*
> *+     ret = ::stat(pathbuf, sbuf);*
> *+   } else {*
> *
> *
> But ::stat() on Windows is broken, as I wrote above. We should not use
> it, especially not if we change the meaing of st_ctime in the double
> byte path.
>
> My suggestion would be to just always call GetFileAttributes(), also
> for the single byte path. A very simple solution would be to just
> always go the double byte path with UNC translation, regardless of the
> path length. Or, if you are worried about performance, for paths
> shorter than MAX_PATH we use GetFileAttributesA(), for longer paths
> GetFileAttributesW with UNC translation. In both cases you get
> a WIN32_FILE_ATTRIBUTE_DATA which you can feed tp your  
> file_attribute_data_to_stat() routine and have the struct stat you
> return be always consistent.
I ran into an issue with the dwFileAttributes value for a jar file. On
Windows Server O/S, the value is set to 0x2020 which is
(FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE) but on a
desktop O/S like Windows 7, it is set to 0x0020 which is just
FILE_ATTRIBUTE_ARCHIVE. I've fixed it in file_attribute_data_to_stat().

I've also used GetTickCounts() to measure the performance of ::stat() vs
GetFileAttributesA() plus file_attribute_data_to_stat(). There's no
difference at the ms resolution. Using the high-resolution timestamp
(https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx),
it doesn't show much difference.

>
>
> But onward:
>
>
>
>         =========================
>
>         src/hotspot/os/windows/os_windows.cpp
>
>         Could you please use wchar_t instead of WCHAR?
>
>     Fixed.
>
>
>         ---------------
>         in os::stat():
>
>         This cumbersome malloc/strcpy sequence:
>
>         !   size_t path_len = strlen(path) + 1;
>         +   char* pathbuf = (char*)os::malloc(path_len * sizeof(char),
>         mtInternal);
>         +   if (pathbuf == NULL) {
>         +     errno = ENOMEM;
>               return -1;
>             }
>             os::native_path(strcpy(pathbuf, path));
>
>          can be reduced to a simple strdup:
>
>         char* pathbuf = os::strdup(path, mtInternal);
>         if (pathbuf == NULL) {
>           errno = ENOMEM;
>           return -1;
>         }
>         os::native_path(pathbuf);
>
>         This also would separate strcpy() from os::native_path() which
>         is a bit unreadable.
>
>     I've made the change.
>
>
>         --------------------
>
>         The single-byte path (strdup, ::stat()), together with its
>         free(), can be moved inside the (strlen(path) < MAX_PATH)
>         condition. os::native_path will not change the path length (it
>         better not, as it operates on the input buffer). That avoids
>         having two allocations on the doublebyte path.
>
>
>     os::native_path() converts a path like C:\\\\somedir to C:\\somedir
>     So I'll need the converted path in the wchar_t case too. The
>     freeing of the pathbuf is being done at the end of the function or
>     in the middle of the wchar_t case if there's an error.
>
>
> Oh, you are right,of course. I missed that it this is needed for both
> paths.
>
>
>
>         -----------------------
>
>         But seeing that translation to UNC paths is something we might
>         want more often, I would combine allocation and UNC prefix
>         adding to one function like this, to avoid the memove and
>         increase reusability:
>
>         // Creates an unc path from a single byte path. Return buffer
>         is allocated in C heap and needs to be freed by caller.
>         Returns NULL on error.
>         static whchar_t* create_unc_path(const char* s) {
>           - does s start with \\?\ ?
>         - yes:
>                     - os::malloc(strlen(s) + 1) and mbstowcs_s
>                 - no:
>                     - os::malloc(strlen(s) + 1 + 4), mbstowcs_s to
>         fourth position in string, prefix with L"\\?\"
>         }
>
>
>     I also include the case for adding  L"\\\\?\\UNC\0" at the
>     beginning to be consistent with libjava/canonicalize_md.c.
>
>
>         We also need error checking to mbstowcs_s.
>
>     I've added assert like the following after the call:
>
>     assert(converted_chars == path_len, "sanity");
>
>
>
> create_unc_path() :
>
> - could you convert the /**/ to // comments, please?
Fixed.
>
> - not sure about the assert after mbstowcs_s: if we happen to
> encounter an invalid multibyte character, function will fail and
> return an error:
>
> https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
> "If mbstowcs_s encounters an invalid multibyte character, it puts 0 in
> *``pReturnValue, sets the destination buffer to an empty string, sets
> errno to EILSEQ, and returns EILSEQ."
I've changed create_unc_path() so that the caller will get the errno and
removed the assert.

static wchar_t* create_unc_path(const char* path, errno_t &err)

>
>  As this is dependent on user data, we should not assert, but handle
> the return code of mbstowcs_s gracefully. Same goes for
> libjava/canonicalize_md.c.
>
> - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len + 7, path,
> path_len);
> third parameter is wrong, as we hand in an offset into the buffer, we
> must decrement the buffer size by this offset, so correct would be
> path_len +7 - 7 or just path_len.
>
> - Same error below: + ::mbstowcs_s(&converted_chars, &wpath[4],
> path_len + 4, path, path_len);
Fixed in both places.

>
>
>
>         Just for cleanliness, I would then wrap deallocation into an
>         own function.
>
>         static viud destroy_unc_path(whchar_t* s) { os::free(s); }
>
>     I've added the destroy function.
>
>
>         These two functions could be candidates of putting into
>         os::win32 namespace, as this form of ANSI->UNC path
>         translation is quite common - whoever wants to use the xxxxW()
>         functions must do this.
>
>     I'm leaving them in os_windows.cpp since they're being used only
>     within that file.
>
>
> Fine by me.
>
>
>
>         -----------------------
>
>         FindFirstFileW:
>
>         I am pretty sure that you can achieve the same result with
>         GetFileAttributesExW(). It also returns WIN32_FIND_DATAW.
>
>     It actually returns WIN32_FILE_ATTRIBUTE_DATA and is very similar
>     to WIN32_FIND_DATAW.
>
>
>         It is way more straightforward to use than FindFirstFileW, as
>         it does not require you to write a callback hook.
>
>     I've switched to using GetFileAttributesExW().
>
>
> Thank you, this is way more readable.
> Another issue: do we still need the fix for 6539723 if we switch from
> stat() to GetFileAttributesExW()? This fix looks important, but the
> comment
> indicates that it could break things if the original bug is not present.
>
> Btw, this is another strong argument for scrapping ::stat() altogether
> on all code paths, not only for long input paths, because stat() and
> GetFileAttributesExW() may behave
> differently. So it would be good to use the same API on all code
> paths, in order to get the best test coverage.
For this round of change, I'm using GetFileAttributesEx[A|W]() for both
code paths.

webrev: http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/

thanks,
Calvin

>
>
>
>         -------
>
>         eval_find_data(): This is more of a generic helper function,
>         could you rename this to something clearer, e.g.
>         make_double_word() ?
>
>     Ok. I've renamed it.
>
>
>         Also, setup_stat(), could this be renamed more clearly to
>         something like WIN32_FIND_DATA_to_stat? or lowercase if this
>         bothers you :)
>
>     I'm naming the function as file_attribute_data_to_stat() to match
>     with the data structure name.
>
>
> Thanks for taking my suggestions.
>
>
>
>         ==================================
>         src/hotspot/share/classfile/classLoader.cpp
>
>         In ClassPathDirEntry::open_stream(), I would feel better if we
>         were asserting _dir and name to be != NULL before feeding it
>         to strlen.
>
>     I've added an assert statement.
>
>
>         ===================
>
>     Here's an updated webrev:
>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>     <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.02/>
>
>
> Thanks!
>
> Thomas
>
>     thanks,
>     Calvin
>
>
>         Thanks!
>
>         Thomas
>
>
>
>
>
>
>         On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung
>         <[hidden email] <mailto:[hidden email]>
>         <mailto:[hidden email]
>         <mailto:[hidden email]>>> wrote:
>
>             I've reworked this fix by using the Unicode version of open
>             (wopen) to handle path name longer than max path with the path
>             prefixed to indicate an extended length path as described
>         in [1].
>
>             The Unicode version of stat (wstat) doesn't work well with
>         long
>             path [2]. So FindFirstFileW will be used.The data in
>             WIN32_FIND_DATA returned from FindFirstFileW needs to be
>             transferred to the stat structure since the caller expects a
>             return stat structure and other platforms return a stat
>         structure.
>
>             In classLoader.cpp, calculate the size of buffer required
>         instead
>             of limiting it to JVM_MAXPATHLEN.
>             In os_windows.cpp, dynamically allocate buffers in
>         os::open and
>             os::stat.
>
>             updated webrev:
>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>>
>
>             It passed hs-tier2 testing using mach5.
>
>             thanks,
>             Calvin
>
>             [1]
>         https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH
>         <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>
>         <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH
>         <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>>
>
>             [2]
>         https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>         <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>         <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>         <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>>
>
>
>
>             On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>
>                 JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>         <https://bugs.openjdk.java.net/browse/JDK-8188122>
>         <https://bugs.openjdk.java.net/browse/JDK-8188122
>         <https://bugs.openjdk.java.net/browse/JDK-8188122>>
>
>                 Adding a warning message if the full path or the directory
>                 length exceeds MAX_PATH on windows.
>
>                 Example warning messages.
>
>                 1) The full path exceeds MAX_PATH:
>
>                 Java HotSpot(TM) 64-Bit Server VM warning: construct
>         full path
>                 name failed: total length 270 exceeds max length of 260
>                     dir
>                
>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>                 length 259
>                     name Hello.class length 11
>
>                 2) The directory path exceeds MAX_PATH:
>
>                 Java HotSpot(TM) 64-Bit Server VM warning: os::stat
>         failed:
>                 path length 265 exceeds max length 260
>                     path
>                
>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>
>                 webrev:
>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>>
>
>                 Testing:
>                     JPRT (including the new test)
>
>                 thanks,
>                 Calvin
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Thomas Stüfe-2
Hi Calvin,

On Wed, Nov 1, 2017 at 8:07 PM, Calvin Cheung <[hidden email]>
wrote:

>
>
> On 10/27/17, 12:55 AM, Thomas Stüfe wrote:
>
> Hi Calvin,
>
> On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung <[hidden email]>
> wrote:
>
>> Hi Thomas,
>>
>> On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>>
>>> Hi Calvin,
>>>
>>> this is a worthwhile addition, thank you for your work!
>>>
>> Thanks for your review.
>
>
> Thanks for your work :)
>
> First off, there is another issue in file_attribute_data_to_stat(). We
> actually had this issue before, but your work uncovered it:
>
> According to POSIX (http://pubs.opengroup.org/
> onlinepubs/009695399/basedefs/sys/stat.h.html) and every unix manpage I
> looked at (solaris, linux, aix..), st_ctime is not the file creation time
> but the last time file status was changed. Only Microsoft gets this wrong
> in their posix layer, its stat() function returns  (
> https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx) creation time in
> st_ctime.
>
> But as os::stat() is a platform independent layer, I'd say we should
> behave the same on all platforms, and that would be the Posix way.
>
> I did not find any code calling os::stat() and using st_ctime, so this is
> still save to change for windows. (Note that I found that
> perfMemory_xxx.cpp uses plain OS ::stat and misuses ctime as "creation
> time" - I opened a bug for that (https://bugs.openjdk.java.
> net/browse/JDK-8190260 - but that does not affect us, as they do not call
> os::stat()).
>
> There is one more complication: in os::stat() we use plain ::stat() in the
> single byte case.:
>
> *+ if (strlen(path) < MAX_PATH) {*
>
> *+     ret = ::stat(pathbuf, sbuf);**+   } else {*
>
>
> But ::stat() on Windows is broken, as I wrote above. We should not use it,
> especially not if we change the meaing of st_ctime in the double byte path.
>
> My suggestion would be to just always call GetFileAttributes(), also for
> the single byte path. A very simple solution would be to just always go the
> double byte path with UNC translation, regardless of the path length. Or,
> if you are worried about performance, for paths shorter than MAX_PATH we
> use GetFileAttributesA(), for longer paths GetFileAttributesW with UNC
> translation. In both cases you get a WIN32_FILE_ATTRIBUTE_DATA which you
> can feed tp your  file_attribute_data_to_stat() routine and have the struct
> stat you return be always consistent.
>
> I ran into an issue with the dwFileAttributes value for a jar file. On
> Windows Server O/S, the value is set to 0x2020 which is
> (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE) but on a
> desktop O/S like Windows 7, it is set to 0x0020 which is just
> FILE_ATTRIBUTE_ARCHIVE. I've fixed it in file_attribute_data_to_stat().
>
>
Oh.. :( good catch! But that opens a new can of worms I did not see before:

FILE_ATTRIBUTE_NORMAL is documented as "A file that does not have other
attributes set. This attribute is valid only when used alone." In addition
to this flag, we have a multitude of things like FILE_ATTRIBUTE_ARCHIVE,
FILE_ATTRIBUTE_ENCRYPTED, FILE_ATTRIBUTE_READONLY ... etc, all cases where
we assume this is a normal file in Unix terminology and where we would
expect os::stat to return S_IFREG, but where according to the msdn doc
FILE_ATTRIBUTE_NORMAL is not set.

Looking at what others do in this scenario (Looked at mingw sources and at
ReactOS - I am not posting any source code here for legal reasons but feel
free to look for yourself), the usual way to translate
WIN32_FILE_ATTRIBUTES to struct stat seems to be:
"if FILE_ATTRIBUTE_DIRECTORY then set S_IFDIR else S_IFREG" (so, no
dependency on FILE_ATTRIBUTE_NORMAL).

I wonder about other special cases which should work too:

- read-only files should be S_IFREG and !S_IWRITE, read-only directories
should be S_IFDIR and !S_IWRITE.
- We should tread the device root ("C:\") as a directory (so,
os::stat("c:\") should return S_IFDIR). Does this work?


> I've also used GetTickCounts() to measure the performance of ::stat() vs
> GetFileAttributesA() plus file_attribute_data_to_stat(). There's no
> difference at the ms resolution. Using the high-resolution timestamp (
> https://msdn.microsoft.com/en-us/library/windows/desktop/
> dn553408(v=vs.85).aspx), it doesn't show much difference.
>
>
stat() seems to be implemented using FindFirstFile() (see crt sources if
you can), and we call GetFileAttributesA(), so I do not think this differs
much.


>
> But onward:
>
>
>>
>>> =========================
>>>
>>> src/hotspot/os/windows/os_windows.cpp
>>>
>>> Could you please use wchar_t instead of WCHAR?
>>>
>> Fixed.
>>
>>>
>>> ---------------
>>> in os::stat():
>>>
>>> This cumbersome malloc/strcpy sequence:
>>>
>>> !   size_t path_len = strlen(path) + 1;
>>> +   char* pathbuf = (char*)os::malloc(path_len * sizeof(char),
>>> mtInternal);
>>> +   if (pathbuf == NULL) {
>>> +     errno = ENOMEM;
>>>       return -1;
>>>     }
>>>     os::native_path(strcpy(pathbuf, path));
>>>
>>>  can be reduced to a simple strdup:
>>>
>>> char* pathbuf = os::strdup(path, mtInternal);
>>> if (pathbuf == NULL) {
>>>   errno = ENOMEM;
>>>   return -1;
>>> }
>>> os::native_path(pathbuf);
>>>
>>> This also would separate strcpy() from os::native_path() which is a bit
>>> unreadable.
>>>
>> I've made the change.
>>
>>>
>>> --------------------
>>>
>>> The single-byte path (strdup, ::stat()), together with its free(), can
>>> be moved inside the (strlen(path) < MAX_PATH) condition. os::native_path
>>> will not change the path length (it better not, as it operates on the input
>>> buffer). That avoids  having two allocations on the doublebyte path.
>>>
>>
>> os::native_path() converts a path like C:\\\\somedir to C:\\somedir
>> So I'll need the converted path in the wchar_t case too. The freeing of
>> the pathbuf is being done at the end of the function or in the middle of
>> the wchar_t case if there's an error.
>
>
> Oh, you are right,of course. I missed that it this is needed for both
> paths.
>
>
>>
>>
>>> -----------------------
>>>
>>> But seeing that translation to UNC paths is something we might want more
>>> often, I would combine allocation and UNC prefix adding to one function
>>> like this, to avoid the memove and increase reusability:
>>>
>>> // Creates an unc path from a single byte path. Return buffer is
>>> allocated in C heap and needs to be freed by caller. Returns NULL on error.
>>> static whchar_t* create_unc_path(const char* s) {
>>>   - does s start with \\?\ ?
>>> - yes:
>>>             - os::malloc(strlen(s) + 1) and mbstowcs_s
>>>         - no:
>>>             - os::malloc(strlen(s) + 1 + 4), mbstowcs_s to fourth
>>> position in string, prefix with L"\\?\"
>>> }
>>>
>>
>> I also include the case for adding  L"\\\\?\\UNC\0" at the beginning to
>> be consistent with libjava/canonicalize_md.c.
>
>
>>> We also need error checking to mbstowcs_s.
>>>
>> I've added assert like the following after the call:
>>
>> assert(converted_chars == path_len, "sanity");
>
>
>
> create_unc_path() :
>
> - could you convert the /**/ to // comments, please?
>
> Fixed.
>

thanks.


>
>
> - not sure about the assert after mbstowcs_s: if we happen to encounter an
> invalid multibyte character, function will fail and return an error:
>
> https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
> "If mbstowcs_s encounters an invalid multibyte character, it puts 0 in
> *``pReturnValue, sets the destination buffer to an empty string, sets errno
> to EILSEQ, and returns EILSEQ."
>
> I've changed create_unc_path() so that the caller will get the errno and
> removed the assert.
>
> static wchar_t* create_unc_path(const char* path, errno_t &err)
>

Okay, works for me.


>
>
>  As this is dependent on user data, we should not assert, but handle the
> return code of mbstowcs_s gracefully. Same goes for
> libjava/canonicalize_md.c.
>
> - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len + 7, path,
> path_len);
> third parameter is wrong, as we hand in an offset into the buffer, we must
> decrement the buffer size by this offset, so correct would be path_len +7 -
> 7 or just path_len.
>
> - Same error below: + ::mbstowcs_s(&converted_chars, &wpath[4], path_len +
> 4, path, path_len);
>
> Fixed in both places.
>

Okay.


>
>
>
>
>>
>>
>>> Just for cleanliness, I would then wrap deallocation into an own
>>> function.
>>>
>>> static viud destroy_unc_path(whchar_t* s) { os::free(s); }
>>>
>> I've added the destroy function.
>>
>>>
>>> These two functions could be candidates of putting into os::win32
>>> namespace, as this form of ANSI->UNC path translation is quite common -
>>> whoever wants to use the xxxxW() functions must do this.
>>>
>> I'm leaving them in os_windows.cpp since they're being used only within
>> that file.
>
>
> Fine by me.
>
>
>>
>>
>>> -----------------------
>>>
>>> FindFirstFileW:
>>>
>>> I am pretty sure that you can achieve the same result with
>>> GetFileAttributesExW(). It also returns WIN32_FIND_DATAW.
>>>
>> It actually returns WIN32_FILE_ATTRIBUTE_DATA and is very similar to
>> WIN32_FIND_DATAW.
>>
>>>
>>> It is way more straightforward to use than FindFirstFileW, as it does
>>> not require you to write a callback hook.
>>>
>> I've switched to using GetFileAttributesExW().
>
>
> Thank you, this is way more readable.
>
> Another issue: do we still need the fix for 6539723 if we switch from
> stat() to GetFileAttributesExW()? This fix looks important, but the
> comment
> indicates that it could break things if the original bug is not present.
>
> Btw, this is another strong argument for scrapping ::stat() altogether on
> all code paths, not only for long input paths, because stat() and
> GetFileAttributesExW() may behave
> differently. So it would be good to use the same API on all code paths, in
> order to get the best test coverage.
>
> For this round of change, I'm using GetFileAttributesEx[A|W]() for both
> code paths.
>
> webrev: http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/
>
> thanks,
> Calvin
>

Okay, all good apart from the issues mentioned above. Thanks for your work!

Best Regards, Thomas



>
>
>>
>>> -------
>>>
>>> eval_find_data(): This is more of a generic helper function, could you
>>> rename this to something clearer, e.g. make_double_word() ?
>>>
>> Ok. I've renamed it.
>>
>>>
>>> Also, setup_stat(), could this be renamed more clearly to something like
>>> WIN32_FIND_DATA_to_stat? or lowercase if this bothers you :)
>>>
>> I'm naming the function as file_attribute_data_to_stat() to match with
>> the data structure name.
>
>
> Thanks for taking my suggestions.
>
>
>>
>>
>>> ==================================
>>> src/hotspot/share/classfile/classLoader.cpp
>>>
>>> In ClassPathDirEntry::open_stream(), I would feel better if we were
>>> asserting _dir and name to be != NULL before feeding it to strlen.
>>>
>> I've added an assert statement.
>>
>>>
>>> ===================
>>>
>> Here's an updated webrev:
>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>>
>>
> Thanks!
>
> Thomas
>
>
>> thanks,
>> Calvin
>>
>>
>>> Thanks!
>>>
>>> Thomas
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung <[hidden email]
>>> <mailto:[hidden email]>> wrote:
>>>
>>>     I've reworked this fix by using the Unicode version of open
>>>     (wopen) to handle path name longer than max path with the path
>>>     prefixed to indicate an extended length path as described in [1].
>>>
>>>     The Unicode version of stat (wstat) doesn't work well with long
>>>     path [2]. So FindFirstFileW will be used.The data in
>>>     WIN32_FIND_DATA returned from FindFirstFileW needs to be
>>>     transferred to the stat structure since the caller expects a
>>>     return stat structure and other platforms return a stat structure.
>>>
>>>     In classLoader.cpp, calculate the size of buffer required instead
>>>     of limiting it to JVM_MAXPATHLEN.
>>>     In os_windows.cpp, dynamically allocate buffers in os::open and
>>>     os::stat.
>>>
>>>     updated webrev:
>>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>>>     <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>>>
>>>     It passed hs-tier2 testing using mach5.
>>>
>>>     thanks,
>>>     Calvin
>>>
>>>     [1]
>>>     https://msdn.microsoft.com/en-us/library/windows/desktop/aa3
>>> 65247(v=vs.85).aspx#MAX_PATH
>>>     <https://msdn.microsoft.com/en-us/library/windows/desktop/aa
>>> 365247%28v=vs.85%29.aspx#MAX_PATH>
>>>
>>>     [2]
>>>     https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093
>>> ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>>     <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c09
>>> 3ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>>>
>>>
>>>
>>>     On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>>>
>>>         JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>>>         <https://bugs.openjdk.java.net/browse/JDK-8188122>
>>>
>>>         Adding a warning message if the full path or the directory
>>>         length exceeds MAX_PATH on windows.
>>>
>>>         Example warning messages.
>>>
>>>         1) The full path exceeds MAX_PATH:
>>>
>>>         Java HotSpot(TM) 64-Bit Server VM warning: construct full path
>>>         name failed: total length 270 exceeds max length of 260
>>>             dir
>>>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClas
>>> s\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>         length 259
>>>             name Hello.class length 11
>>>
>>>         2) The directory path exceeds MAX_PATH:
>>>
>>>         Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed:
>>>         path length 265 exceeds max length 260
>>>             path
>>>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClas
>>> s\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>>
>>>         webrev:
>>>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>>>
>>>         Testing:
>>>             JPRT (including the new test)
>>>
>>>         thanks,
>>>         Calvin
>>>
>>>
>>>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung


On 11/7/17, 6:12 AM, Thomas Stüfe wrote:

> Hi Calvin,
>
> On Wed, Nov 1, 2017 at 8:07 PM, Calvin Cheung
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>
>
>     On 10/27/17, 12:55 AM, Thomas Stüfe wrote:
>>     Hi Calvin,
>>
>>     On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung
>>     <[hidden email] <mailto:[hidden email]>> wrote:
>>
>>         Hi Thomas,
>>
>>         On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>>
>>             Hi Calvin,
>>
>>             this is a worthwhile addition, thank you for your work!
>>
>>         Thanks for your review.
>>
>>
>>     Thanks for your work :)
>>
>>     First off, there is another issue in
>>     file_attribute_data_to_stat(). We actually had this issue before,
>>     but your work uncovered it:
>>
>>     According to POSIX
>>     (http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html
>>     <http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html>)
>>     and every unix manpage I looked at (solaris, linux, aix..),
>>     st_ctime is not the file creation time but the last time file
>>     status was changed. Only Microsoft gets this wrong in their posix
>>     layer, its stat() function returns
>>      (https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx
>>     <https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx>)
>>     creation time in st_ctime.
>>
>>     But as os::stat() is a platform independent layer, I'd say we
>>     should behave the same on all platforms, and that would be the
>>     Posix way.
>>
>>     I did not find any code calling os::stat() and using st_ctime, so
>>     this is still save to change for windows. (Note that I found that
>>     perfMemory_xxx.cpp uses plain OS ::stat and misuses ctime as
>>     "creation time" - I opened a bug for that
>>     (https://bugs.openjdk.java.net/browse/JDK-8190260
>>     <https://bugs.openjdk.java.net/browse/JDK-8190260> - but that
>>     does not affect us, as they do not call os::stat()).
>>
>>     There is one more complication: in os::stat() we use plain
>>     ::stat() in the single byte case.:
>>
>>     *+ if (strlen(path) < MAX_PATH) {*
>>     *+     ret = ::stat(pathbuf, sbuf);*
>>     *+   } else {*
>>     *
>>     *
>>     But ::stat() on Windows is broken, as I wrote above. We should
>>     not use it, especially not if we change the meaing of st_ctime in
>>     the double byte path.
>>
>>     My suggestion would be to just always call GetFileAttributes(),
>>     also for the single byte path. A very simple solution would be to
>>     just always go the double byte path with UNC translation,
>>     regardless of the path length. Or, if you are worried about
>>     performance, for paths shorter than MAX_PATH we use
>>     GetFileAttributesA(), for longer paths GetFileAttributesW with
>>     UNC translation. In both cases you get
>>     a WIN32_FILE_ATTRIBUTE_DATA which you can feed tp your
>>     file_attribute_data_to_stat() routine and have the struct stat
>>     you return be always consistent.
>     I ran into an issue with the dwFileAttributes value for a jar
>     file. On Windows Server O/S, the value is set to 0x2020 which is
>     (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE) but
>     on a desktop O/S like Windows 7, it is set to 0x0020 which is just
>     FILE_ATTRIBUTE_ARCHIVE. I've fixed it in
>     file_attribute_data_to_stat().
>
>
> Oh.. :( good catch! But that opens a new can of worms I did not see
> before:
>
> FILE_ATTRIBUTE_NORMAL is documented as "A file that does not have
> other attributes set. This attribute is valid only when used alone."
> In addition to this flag, we have a multitude of things like
> FILE_ATTRIBUTE_ARCHIVE, FILE_ATTRIBUTE_ENCRYPTED,
> FILE_ATTRIBUTE_READONLY ... etc, all cases where we assume this is a
> normal file in Unix terminology and where we would expect os::stat to
> return S_IFREG, but where according to the msdn doc
> FILE_ATTRIBUTE_NORMAL is not set.
>
> Looking at what others do in this scenario (Looked at mingw sources
> and at ReactOS - I am not posting any source code here for legal
> reasons but feel free to look for yourself), the usual way to
> translate WIN32_FILE_ATTRIBUTES to struct stat seems to be:
> "if FILE_ATTRIBUTE_DIRECTORY then set S_IFDIR else S_IFREG" (so, no
> dependency on FILE_ATTRIBUTE_NORMAL).
This makes sense but I ran into similar problem as before - the
dwFileAttributes has a different value on a windows server O/S vs
desktop O/S. So I need to do the check as follows:

+   // A directory has the dwFileAttributes value of 0x2010 which is
+   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
+   // on Windows Server O/S but the value is 0x0010 on Windows desktop O/S
+   // such as Windows 7.
+   if ((file_data.dwFileAttributes&  FILE_ATTRIBUTE_DIRECTORY) != 0) {
+     sbuf->st_mode |= S_IFDIR;
+   } else {
+     sbuf->st_mode |= S_IFREG;
+   }

updated webrev:
     http://cr.openjdk.java.net/~ccheung/8188122/webrev.04/

Diff from webrev.03:

< --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-08
08:50:40.170786397 -0800
< +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-08
08:50:39.803751877 -0800
< @@ -4060,41 +4060,119 @@
---
 > --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-01
09:40:13.657460425 -0700
 > +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-01
09:40:13.261423192 -0700
 > @@ -4060,41 +4060,121 @@
25,29c25
< +  // A directory has the dwFileAttributes value of 0x2010 which is
< +  // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
< +  // on Windows Server O/S but the value is 0x0010 on Windows desktop O/S
< +  // such as Windows 7.
< +  if ((file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0) {
---
 > +  if (file_data.dwFileAttributes == FILE_ATTRIBUTE_DIRECTORY) {
31c27,33
< +  } else {
---
 > +  }
 > +  if ((file_data.dwFileAttributes == FILE_ATTRIBUTE_NORMAL) ||
 > +      // an archive file such as a jar file has the dwFileAttributes
value of
 > +      // 0x2020 (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
FILE_ATTRIBUTE_ARCHIVE)
 > +      // on Windows Server O/S but the value is 0x0020 on
 > +      // Windows desktop O/S such as Windows 7.
 > +      ((file_data.dwFileAttributes & FILE_ATTRIBUTE_ARCHIVE) != 0)) {

>
> I wonder about other special cases which should work too:
> - read-only files should be S_IFREG and !S_IWRITE,
For a read-only system file under the user's home dir.

st_mode & 0xFF00 = 0x8100 = S_IFREG | S_IREAD
dwFileAttributes = 39 = (FILE_ATTRIBUTE_ARCHIVE | FILE_ATTRIBUTE_HIDDEN
| FILE_ATTRIBUTE_SYSTEM | FILE_ATTRIBUTE_READONLY)
> read-only directories should be S_IFDIR and !S_IWRITE.
I've tried the C:\progra~1 dir.

st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
dwFileAttributes = 17 = (FILE_ATTRIBUTE_DIRECTORY | FILE_ATTRIBUTE_READONLY)
> - We should tread the device root ("C:\") as a directory (so,
> os::stat("c:\") should return S_IFDIR). Does this work?
This one works too.

st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
dwFileAttributes = 22 = (FILE_ATTRIBUTE_DIRECTORY |
FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM)

>
>     I've also used GetTickCounts() to measure the performance of
>     ::stat() vs GetFileAttributesA() plus
>     file_attribute_data_to_stat(). There's no difference at the ms
>     resolution. Using the high-resolution timestamp
>     (https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx)
>     <https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408%28v=vs.85%29.aspx%29>,
>     it doesn't show much difference.
>
>
> stat() seems to be implemented using FindFirstFile() (see crt sources
> if you can), and we call GetFileAttributesA(), so I do not think this
> differs much.
Yes, I saw the same in my Visual Studio installation.

thanks,
Calvin

>
>>
>>     But onward:
>>
>>
>>
>>             =========================
>>
>>             src/hotspot/os/windows/os_windows.cpp
>>
>>             Could you please use wchar_t instead of WCHAR?
>>
>>         Fixed.
>>
>>
>>             ---------------
>>             in os::stat():
>>
>>             This cumbersome malloc/strcpy sequence:
>>
>>             !   size_t path_len = strlen(path) + 1;
>>             +   char* pathbuf = (char*)os::malloc(path_len *
>>             sizeof(char), mtInternal);
>>             +   if (pathbuf == NULL) {
>>             +     errno = ENOMEM;
>>                   return -1;
>>                 }
>>                 os::native_path(strcpy(pathbuf, path));
>>
>>              can be reduced to a simple strdup:
>>
>>             char* pathbuf = os::strdup(path, mtInternal);
>>             if (pathbuf == NULL) {
>>               errno = ENOMEM;
>>               return -1;
>>             }
>>             os::native_path(pathbuf);
>>
>>             This also would separate strcpy() from os::native_path()
>>             which is a bit unreadable.
>>
>>         I've made the change.
>>
>>
>>             --------------------
>>
>>             The single-byte path (strdup, ::stat()), together with
>>             its free(), can be moved inside the (strlen(path) <
>>             MAX_PATH) condition. os::native_path will not change the
>>             path length (it better not, as it operates on the input
>>             buffer). That avoids  having two allocations on the
>>             doublebyte path.
>>
>>
>>         os::native_path() converts a path like C:\\\\somedir to
>>         C:\\somedir
>>         So I'll need the converted path in the wchar_t case too. The
>>         freeing of the pathbuf is being done at the end of the
>>         function or in the middle of the wchar_t case if there's an
>>         error.
>>
>>
>>     Oh, you are right,of course. I missed that it this is needed for
>>     both paths.
>>
>>
>>
>>             -----------------------
>>
>>             But seeing that translation to UNC paths is something we
>>             might want more often, I would combine allocation and UNC
>>             prefix adding to one function like this, to avoid the
>>             memove and increase reusability:
>>
>>             // Creates an unc path from a single byte path. Return
>>             buffer is allocated in C heap and needs to be freed by
>>             caller. Returns NULL on error.
>>             static whchar_t* create_unc_path(const char* s) {
>>               - does s start with \\?\ ?
>>             - yes:
>>                         - os::malloc(strlen(s) + 1) and mbstowcs_s
>>                     - no:
>>                         - os::malloc(strlen(s) + 1 + 4), mbstowcs_s
>>             to fourth position in string, prefix with L"\\?\"
>>             }
>>
>>
>>         I also include the case for adding  L"\\\\?\\UNC\0" at the
>>         beginning to be consistent with libjava/canonicalize_md.c.
>>
>>
>>             We also need error checking to mbstowcs_s.
>>
>>         I've added assert like the following after the call:
>>
>>         assert(converted_chars == path_len, "sanity");
>>
>>
>>
>>     create_unc_path() :
>>
>>     - could you convert the /**/ to // comments, please?
>     Fixed.
>
>
> thanks.
>
>
>>
>>     - not sure about the assert after mbstowcs_s: if we happen to
>>     encounter an invalid multibyte character, function will fail and
>>     return an error:
>>
>>     https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
>>     <https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx>
>>     "If mbstowcs_s encounters an invalid multibyte character, it puts
>>     0 in *``pReturnValue, sets the destination buffer to an empty
>>     string, sets errno to EILSEQ, and returns EILSEQ."
>     I've changed create_unc_path() so that the caller will get the
>     errno and removed the assert.
>
>     static wchar_t* create_unc_path(const char* path, errno_t &err)
>
>
> Okay, works for me.
>
>
>>
>>      As this is dependent on user data, we should not assert, but
>>     handle the return code of mbstowcs_s gracefully. Same goes for
>>     libjava/canonicalize_md.c.
>>
>>     - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len + 7,
>>     path, path_len);
>>     third parameter is wrong, as we hand in an offset into the
>>     buffer, we must decrement the buffer size by this offset, so
>>     correct would be path_len +7 - 7 or just path_len.
>>
>>     - Same error below: + ::mbstowcs_s(&converted_chars, &wpath[4],
>>     path_len + 4, path, path_len);
>     Fixed in both places.
>
>
> Okay.
>
>
>>
>>
>>             Just for cleanliness, I would then wrap deallocation into
>>             an own function.
>>
>>             static viud destroy_unc_path(whchar_t* s) { os::free(s); }
>>
>>         I've added the destroy function.
>>
>>
>>             These two functions could be candidates of putting into
>>             os::win32 namespace, as this form of ANSI->UNC path
>>             translation is quite common - whoever wants to use the
>>             xxxxW() functions must do this.
>>
>>         I'm leaving them in os_windows.cpp since they're being used
>>         only within that file.
>>
>>
>>     Fine by me.
>>
>>
>>
>>             -----------------------
>>
>>             FindFirstFileW:
>>
>>             I am pretty sure that you can achieve the same result
>>             with GetFileAttributesExW(). It also returns
>>             WIN32_FIND_DATAW.
>>
>>         It actually returns WIN32_FILE_ATTRIBUTE_DATA and is very
>>         similar to WIN32_FIND_DATAW.
>>
>>
>>             It is way more straightforward to use than
>>             FindFirstFileW, as it does not require you to write a
>>             callback hook.
>>
>>         I've switched to using GetFileAttributesExW().
>>
>>
>>     Thank you, this is way more readable.
>>     Another issue: do we still need the fix for 6539723 if we switch
>>     from stat() to GetFileAttributesExW()? This fix looks important,
>>     but the comment
>>     indicates that it could break things if the original bug is not
>>     present.
>>
>>     Btw, this is another strong argument for scrapping ::stat()
>>     altogether on all code paths, not only for long input paths,
>>     because stat() and GetFileAttributesExW() may behave
>>     differently. So it would be good to use the same API on all code
>>     paths, in order to get the best test coverage.
>     For this round of change, I'm using GetFileAttributesEx[A|W]() for
>     both code paths.
>
>     webrev: http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/
>     <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.03/>
>
>     thanks,
>     Calvin
>
>
> Okay, all good apart from the issues mentioned above. Thanks for your
> work!
>
> Best Regards, Thomas
>
>
>
>>
>>
>>
>>             -------
>>
>>             eval_find_data(): This is more of a generic helper
>>             function, could you rename this to something clearer,
>>             e.g. make_double_word() ?
>>
>>         Ok. I've renamed it.
>>
>>
>>             Also, setup_stat(), could this be renamed more clearly to
>>             something like WIN32_FIND_DATA_to_stat? or lowercase if
>>             this bothers you :)
>>
>>         I'm naming the function as file_attribute_data_to_stat() to
>>         match with the data structure name.
>>
>>
>>     Thanks for taking my suggestions.
>>
>>
>>
>>             ==================================
>>             src/hotspot/share/classfile/classLoader.cpp
>>
>>             In ClassPathDirEntry::open_stream(), I would feel better
>>             if we were asserting _dir and name to be != NULL before
>>             feeding it to strlen.
>>
>>         I've added an assert statement.
>>
>>
>>             ===================
>>
>>         Here's an updated webrev:
>>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.02/>
>>
>>
>>     Thanks!
>>
>>     Thomas
>>
>>         thanks,
>>         Calvin
>>
>>
>>             Thanks!
>>
>>             Thomas
>>
>>
>>
>>
>>
>>
>>             On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung
>>             <[hidden email]
>>             <mailto:[hidden email]>
>>             <mailto:[hidden email]
>>             <mailto:[hidden email]>>> wrote:
>>
>>                 I've reworked this fix by using the Unicode version
>>             of open
>>                 (wopen) to handle path name longer than max path with
>>             the path
>>                 prefixed to indicate an extended length path as
>>             described in [1].
>>
>>                 The Unicode version of stat (wstat) doesn't work well
>>             with long
>>                 path [2]. So FindFirstFileW will be used.The data in
>>                 WIN32_FIND_DATA returned from FindFirstFileW needs to be
>>                 transferred to the stat structure since the caller
>>             expects a
>>                 return stat structure and other platforms return a
>>             stat structure.
>>
>>                 In classLoader.cpp, calculate the size of buffer
>>             required instead
>>                 of limiting it to JVM_MAXPATHLEN.
>>                 In os_windows.cpp, dynamically allocate buffers in
>>             os::open and
>>                 os::stat.
>>
>>                 updated webrev:
>>             http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>>             <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>>             <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/
>>             <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>>
>>
>>                 It passed hs-tier2 testing using mach5.
>>
>>                 thanks,
>>                 Calvin
>>
>>                 [1]
>>             https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH
>>             <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>
>>             <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH
>>             <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>>
>>
>>                 [2]
>>             https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>             <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>>             <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>             <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>>
>>
>>
>>
>>                 On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>>
>>                     JBS:
>>             https://bugs.openjdk.java.net/browse/JDK-8188122
>>             <https://bugs.openjdk.java.net/browse/JDK-8188122>
>>             <https://bugs.openjdk.java.net/browse/JDK-8188122
>>             <https://bugs.openjdk.java.net/browse/JDK-8188122>>
>>
>>                     Adding a warning message if the full path or the
>>             directory
>>                     length exceeds MAX_PATH on windows.
>>
>>                     Example warning messages.
>>
>>                     1) The full path exceeds MAX_PATH:
>>
>>                     Java HotSpot(TM) 64-Bit Server VM warning:
>>             construct full path
>>                     name failed: total length 270 exceeds max length
>>             of 260
>>                         dir
>>                    
>>             T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>                     length 259
>>                         name Hello.class length 11
>>
>>                     2) The directory path exceeds MAX_PATH:
>>
>>                     Java HotSpot(TM) 64-Bit Server VM warning:
>>             os::stat failed:
>>                     path length 265 exceeds max length 260
>>                         path
>>                    
>>             T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>
>>                     webrev:
>>             http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>             <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>>             <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/
>>             <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>>
>>
>>                     Testing:
>>                         JPRT (including the new test)
>>
>>                     thanks,
>>                     Calvin
>>
>>
>>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Thomas Stüfe-2
Hi Calvin,

On Wed, Nov 8, 2017 at 6:27 PM, Calvin Cheung <[hidden email]>
wrote:

>
>
> On 11/7/17, 6:12 AM, Thomas Stüfe wrote:
>
> Hi Calvin,
>
> On Wed, Nov 1, 2017 at 8:07 PM, Calvin Cheung <[hidden email]>
> wrote:
>
>>
>>
>> On 10/27/17, 12:55 AM, Thomas Stüfe wrote:
>>
>> Hi Calvin,
>>
>> On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung <[hidden email]>
>> wrote:
>>
>>> Hi Thomas,
>>>
>>> On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>>>
>>>> Hi Calvin,
>>>>
>>>> this is a worthwhile addition, thank you for your work!
>>>>
>>> Thanks for your review.
>>
>>
>> Thanks for your work :)
>>
>> First off, there is another issue in file_attribute_data_to_stat(). We
>> actually had this issue before, but your work uncovered it:
>>
>> According to POSIX (http://pubs.opengroup.org/onl
>> inepubs/009695399/basedefs/sys/stat.h.html) and every unix manpage I
>> looked at (solaris, linux, aix..), st_ctime is not the file creation time
>> but the last time file status was changed. Only Microsoft gets this wrong
>> in their posix layer, its stat() function returns  (
>> https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx) creation time in
>> st_ctime.
>>
>> But as os::stat() is a platform independent layer, I'd say we should
>> behave the same on all platforms, and that would be the Posix way.
>>
>> I did not find any code calling os::stat() and using st_ctime, so this is
>> still save to change for windows. (Note that I found that
>> perfMemory_xxx.cpp uses plain OS ::stat and misuses ctime as "creation
>> time" - I opened a bug for that (https://bugs.openjdk.java.net
>> /browse/JDK-8190260 - but that does not affect us, as they do not call
>> os::stat()).
>>
>> There is one more complication: in os::stat() we use plain ::stat() in
>> the single byte case.:
>>
>> *+ if (strlen(path) < MAX_PATH) {*
>>
>> *+     ret = ::stat(pathbuf, sbuf);**+   } else {*
>>
>>
>> But ::stat() on Windows is broken, as I wrote above. We should not use
>> it, especially not if we change the meaing of st_ctime in the double byte
>> path.
>>
>> My suggestion would be to just always call GetFileAttributes(), also for
>> the single byte path. A very simple solution would be to just always go the
>> double byte path with UNC translation, regardless of the path length. Or,
>> if you are worried about performance, for paths shorter than MAX_PATH we
>> use GetFileAttributesA(), for longer paths GetFileAttributesW with UNC
>> translation. In both cases you get a WIN32_FILE_ATTRIBUTE_DATA which you
>> can feed tp your  file_attribute_data_to_stat() routine and have the struct
>> stat you return be always consistent.
>>
>> I ran into an issue with the dwFileAttributes value for a jar file. On
>> Windows Server O/S, the value is set to 0x2020 which is
>> (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE) but on a
>> desktop O/S like Windows 7, it is set to 0x0020 which is just
>> FILE_ATTRIBUTE_ARCHIVE. I've fixed it in file_attribute_data_to_stat().
>>
>>
> Oh.. :( good catch! But that opens a new can of worms I did not see before:
>
> FILE_ATTRIBUTE_NORMAL is documented as "A file that does not have other
> attributes set. This attribute is valid only when used alone." In addition
> to this flag, we have a multitude of things like FILE_ATTRIBUTE_ARCHIVE,
> FILE_ATTRIBUTE_ENCRYPTED, FILE_ATTRIBUTE_READONLY ... etc, all cases where
> we assume this is a normal file in Unix terminology and where we would
> expect os::stat to return S_IFREG, but where according to the msdn doc
> FILE_ATTRIBUTE_NORMAL is not set.
>
> Looking at what others do in this scenario (Looked at mingw sources and at
> ReactOS - I am not posting any source code here for legal reasons but feel
> free to look for yourself), the usual way to translate
> WIN32_FILE_ATTRIBUTES to struct stat seems to be:
> "if FILE_ATTRIBUTE_DIRECTORY then set S_IFDIR else S_IFREG" (so, no
> dependency on FILE_ATTRIBUTE_NORMAL).
>
> This makes sense but I ran into similar problem as before - the
> dwFileAttributes has a different value on a windows server O/S vs desktop
> O/S. So I need to do the check as follows:
>
> +   // A directory has the dwFileAttributes value of 0x2010 which is+   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)+   // on Windows Server O/S but the value is 0x0010 on Windows desktop O/S+   // such as Windows 7.+   if ((file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0) {+     sbuf->st_mode |= S_IFDIR;+   } else {+     sbuf->st_mode |= S_IFREG;+   }
>
> I scratched my head a bit about the comment till I understood that you
mean "if FILE_ATTRIBUTE_DIRECTORY bit is set it is a directory regardless
of which other flags are set" instead of "if
flags==FILE_ATTRIBUTE_DIRECTORY it is a directory". Sure, I guess my
comment above was sloppy, but this was what I meant. I am not even sure the
comment is needed, this is quite self-explaining.



> updated webrev:
>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.04/
>
>
I am fine with all the changes now. Thank you for your work and patience.

Kind Regards, Thomas






> Diff from webrev.03:
>
> < --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-08
> 08:50:40.170786397 -0800
> < +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-08
> 08:50:39.803751877 -0800
> < @@ -4060,41 +4060,119 @@
> ---
> > --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-01
> 09:40:13.657460425 -0700
> > +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-01
> 09:40:13.261423192 -0700
> > @@ -4060,41 +4060,121 @@
> 25,29c25
> < +  // A directory has the dwFileAttributes value of 0x2010 which is
> < +  // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
> < +  // on Windows Server O/S but the value is 0x0010 on Windows desktop
> O/S
> < +  // such as Windows 7.
> < +  if ((file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0) {
> ---
> > +  if (file_data.dwFileAttributes == FILE_ATTRIBUTE_DIRECTORY) {
> 31c27,33
> < +  } else {
> ---
> > +  }
> > +  if ((file_data.dwFileAttributes == FILE_ATTRIBUTE_NORMAL) ||
> > +      // an archive file such as a jar file has the dwFileAttributes
> value of
> > +      // 0x2020 (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
> FILE_ATTRIBUTE_ARCHIVE)
> > +      // on Windows Server O/S but the value is 0x0020 on
> > +      // Windows desktop O/S such as Windows 7.
> > +      ((file_data.dwFileAttributes & FILE_ATTRIBUTE_ARCHIVE) != 0)) {
>
>
> I wonder about other special cases which should work too:
>
> - read-only files should be S_IFREG and !S_IWRITE,
>
> For a read-only system file under the user's home dir.
>
> st_mode & 0xFF00 = 0x8100 = S_IFREG | S_IREAD
> dwFileAttributes = 39 = (FILE_ATTRIBUTE_ARCHIVE | FILE_ATTRIBUTE_HIDDEN |
> FILE_ATTRIBUTE_SYSTEM | FILE_ATTRIBUTE_READONLY)
>
> read-only directories should be S_IFDIR and !S_IWRITE.
>
> I've tried the C:\progra~1 dir.
>
> st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
> dwFileAttributes = 17 = (FILE_ATTRIBUTE_DIRECTORY |
> FILE_ATTRIBUTE_READONLY)
>
> - We should tread the device root ("C:\") as a directory (so,
> os::stat("c:\") should return S_IFDIR). Does this work?
>
> This one works too.
>
> st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
> dwFileAttributes = 22 = (FILE_ATTRIBUTE_DIRECTORY | FILE_ATTRIBUTE_HIDDEN
> | FILE_ATTRIBUTE_SYSTEM)
>
>
>
>> I've also used GetTickCounts() to measure the performance of ::stat() vs
>> GetFileAttributesA() plus file_attribute_data_to_stat(). There's no
>> difference at the ms resolution. Using the high-resolution timestamp (
>> https://msdn.microsoft.com/en-us/library/windows/desktop/dn
>> 553408(v=vs.85).aspx), it doesn't show much difference.
>>
>>
> stat() seems to be implemented using FindFirstFile() (see crt sources if
> you can), and we call GetFileAttributesA(), so I do not think this differs
> much.
>
> Yes, I saw the same in my Visual Studio installation.
>
> thanks,
> Calvin
>
>
>
>>
>> But onward:
>>
>>
>>>
>>>> =========================
>>>>
>>>> src/hotspot/os/windows/os_windows.cpp
>>>>
>>>> Could you please use wchar_t instead of WCHAR?
>>>>
>>> Fixed.
>>>
>>>>
>>>> ---------------
>>>> in os::stat():
>>>>
>>>> This cumbersome malloc/strcpy sequence:
>>>>
>>>> !   size_t path_len = strlen(path) + 1;
>>>> +   char* pathbuf = (char*)os::malloc(path_len * sizeof(char),
>>>> mtInternal);
>>>> +   if (pathbuf == NULL) {
>>>> +     errno = ENOMEM;
>>>>       return -1;
>>>>     }
>>>>     os::native_path(strcpy(pathbuf, path));
>>>>
>>>>  can be reduced to a simple strdup:
>>>>
>>>> char* pathbuf = os::strdup(path, mtInternal);
>>>> if (pathbuf == NULL) {
>>>>   errno = ENOMEM;
>>>>   return -1;
>>>> }
>>>> os::native_path(pathbuf);
>>>>
>>>> This also would separate strcpy() from os::native_path() which is a bit
>>>> unreadable.
>>>>
>>> I've made the change.
>>>
>>>>
>>>> --------------------
>>>>
>>>> The single-byte path (strdup, ::stat()), together with its free(), can
>>>> be moved inside the (strlen(path) < MAX_PATH) condition. os::native_path
>>>> will not change the path length (it better not, as it operates on the input
>>>> buffer). That avoids  having two allocations on the doublebyte path.
>>>>
>>>
>>> os::native_path() converts a path like C:\\\\somedir to C:\\somedir
>>> So I'll need the converted path in the wchar_t case too. The freeing of
>>> the pathbuf is being done at the end of the function or in the middle of
>>> the wchar_t case if there's an error.
>>
>>
>> Oh, you are right,of course. I missed that it this is needed for both
>> paths.
>>
>>
>>>
>>>
>>>> -----------------------
>>>>
>>>> But seeing that translation to UNC paths is something we might want
>>>> more often, I would combine allocation and UNC prefix adding to one
>>>> function like this, to avoid the memove and increase reusability:
>>>>
>>>> // Creates an unc path from a single byte path. Return buffer is
>>>> allocated in C heap and needs to be freed by caller. Returns NULL on error.
>>>> static whchar_t* create_unc_path(const char* s) {
>>>>   - does s start with \\?\ ?
>>>> - yes:
>>>>             - os::malloc(strlen(s) + 1) and mbstowcs_s
>>>>         - no:
>>>>             - os::malloc(strlen(s) + 1 + 4), mbstowcs_s to fourth
>>>> position in string, prefix with L"\\?\"
>>>> }
>>>>
>>>
>>> I also include the case for adding  L"\\\\?\\UNC\0" at the beginning to
>>> be consistent with libjava/canonicalize_md.c.
>>
>>
>>>> We also need error checking to mbstowcs_s.
>>>>
>>> I've added assert like the following after the call:
>>>
>>> assert(converted_chars == path_len, "sanity");
>>
>>
>>
>> create_unc_path() :
>>
>> - could you convert the /**/ to // comments, please?
>>
>> Fixed.
>>
>
> thanks.
>
>
>>
>>
>> - not sure about the assert after mbstowcs_s: if we happen to encounter
>> an invalid multibyte character, function will fail and return an error:
>>
>> https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
>> "If mbstowcs_s encounters an invalid multibyte character, it puts 0 in
>> *``pReturnValue, sets the destination buffer to an empty string, sets errno
>> to EILSEQ, and returns EILSEQ."
>>
>> I've changed create_unc_path() so that the caller will get the errno and
>> removed the assert.
>>
>> static wchar_t* create_unc_path(const char* path, errno_t &err)
>>
>
> Okay, works for me.
>
>
>>
>>
>>  As this is dependent on user data, we should not assert, but handle the
>> return code of mbstowcs_s gracefully. Same goes for
>> libjava/canonicalize_md.c.
>>
>> - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len + 7, path,
>> path_len);
>> third parameter is wrong, as we hand in an offset into the buffer, we
>> must decrement the buffer size by this offset, so correct would be path_len
>> +7 - 7 or just path_len.
>>
>> - Same error below: + ::mbstowcs_s(&converted_chars, &wpath[4], path_len
>> + 4, path, path_len);
>>
>> Fixed in both places.
>>
>
> Okay.
>
>
>>
>>
>>
>>
>>>
>>>
>>>> Just for cleanliness, I would then wrap deallocation into an own
>>>> function.
>>>>
>>>> static viud destroy_unc_path(whchar_t* s) { os::free(s); }
>>>>
>>> I've added the destroy function.
>>>
>>>>
>>>> These two functions could be candidates of putting into os::win32
>>>> namespace, as this form of ANSI->UNC path translation is quite common -
>>>> whoever wants to use the xxxxW() functions must do this.
>>>>
>>> I'm leaving them in os_windows.cpp since they're being used only within
>>> that file.
>>
>>
>> Fine by me.
>>
>>
>>>
>>>
>>>> -----------------------
>>>>
>>>> FindFirstFileW:
>>>>
>>>> I am pretty sure that you can achieve the same result with
>>>> GetFileAttributesExW(). It also returns WIN32_FIND_DATAW.
>>>>
>>> It actually returns WIN32_FILE_ATTRIBUTE_DATA and is very similar to
>>> WIN32_FIND_DATAW.
>>>
>>>>
>>>> It is way more straightforward to use than FindFirstFileW, as it does
>>>> not require you to write a callback hook.
>>>>
>>> I've switched to using GetFileAttributesExW().
>>
>>
>> Thank you, this is way more readable.
>>
>> Another issue: do we still need the fix for 6539723 if we switch from
>> stat() to GetFileAttributesExW()? This fix looks important, but the
>> comment
>> indicates that it could break things if the original bug is not present.
>>
>> Btw, this is another strong argument for scrapping ::stat() altogether on
>> all code paths, not only for long input paths, because stat() and
>> GetFileAttributesExW() may behave
>> differently. So it would be good to use the same API on all code paths,
>> in order to get the best test coverage.
>>
>> For this round of change, I'm using GetFileAttributesEx[A|W]() for both
>> code paths.
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/
>>
>> thanks,
>> Calvin
>>
>
> Okay, all good apart from the issues mentioned above. Thanks for your work!
>
> Best Regards, Thomas
>
>
>
>>
>>
>>>
>>>> -------
>>>>
>>>> eval_find_data(): This is more of a generic helper function, could you
>>>> rename this to something clearer, e.g. make_double_word() ?
>>>>
>>> Ok. I've renamed it.
>>>
>>>>
>>>> Also, setup_stat(), could this be renamed more clearly to something
>>>> like WIN32_FIND_DATA_to_stat? or lowercase if this bothers you :)
>>>>
>>> I'm naming the function as file_attribute_data_to_stat() to match with
>>> the data structure name.
>>
>>
>> Thanks for taking my suggestions.
>>
>>
>>>
>>>
>>>> ==================================
>>>> src/hotspot/share/classfile/classLoader.cpp
>>>>
>>>> In ClassPathDirEntry::open_stream(), I would feel better if we were
>>>> asserting _dir and name to be != NULL before feeding it to strlen.
>>>>
>>> I've added an assert statement.
>>>
>>>>
>>>> ===================
>>>>
>>> Here's an updated webrev:
>>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>>>
>>>
>> Thanks!
>>
>> Thomas
>>
>>
>>> thanks,
>>> Calvin
>>>
>>>
>>>> Thanks!
>>>>
>>>> Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung <
>>>> [hidden email] <mailto:[hidden email]>> wrote:
>>>>
>>>>     I've reworked this fix by using the Unicode version of open
>>>>     (wopen) to handle path name longer than max path with the path
>>>>     prefixed to indicate an extended length path as described in [1].
>>>>
>>>>     The Unicode version of stat (wstat) doesn't work well with long
>>>>     path [2]. So FindFirstFileW will be used.The data in
>>>>     WIN32_FIND_DATA returned from FindFirstFileW needs to be
>>>>     transferred to the stat structure since the caller expects a
>>>>     return stat structure and other platforms return a stat structure.
>>>>
>>>>     In classLoader.cpp, calculate the size of buffer required instead
>>>>     of limiting it to JVM_MAXPATHLEN.
>>>>     In os_windows.cpp, dynamically allocate buffers in os::open and
>>>>     os::stat.
>>>>
>>>>     updated webrev:
>>>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>>>>     <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>>>>
>>>>     It passed hs-tier2 testing using mach5.
>>>>
>>>>     thanks,
>>>>     Calvin
>>>>
>>>>     [1]
>>>>     https://msdn.microsoft.com/en-us/library/windows/desktop/aa3
>>>> 65247(v=vs.85).aspx#MAX_PATH
>>>>     <https://msdn.microsoft.com/en-us/library/windows/desktop/aa
>>>> 365247%28v=vs.85%29.aspx#MAX_PATH>
>>>>
>>>>     [2]
>>>>     https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093
>>>> ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>>>     <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c09
>>>> 3ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>>>>
>>>>
>>>>
>>>>     On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>>>>
>>>>         JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>>>>         <https://bugs.openjdk.java.net/browse/JDK-8188122>
>>>>
>>>>         Adding a warning message if the full path or the directory
>>>>         length exceeds MAX_PATH on windows.
>>>>
>>>>         Example warning messages.
>>>>
>>>>         1) The full path exceeds MAX_PATH:
>>>>
>>>>         Java HotSpot(TM) 64-Bit Server VM warning: construct full path
>>>>         name failed: total length 270 exceeds max length of 260
>>>>             dir
>>>>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClas
>>>> s\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>>         length 259
>>>>             name Hello.class length 11
>>>>
>>>>         2) The directory path exceeds MAX_PATH:
>>>>
>>>>         Java HotSpot(TM) 64-Bit Server VM warning: os::stat failed:
>>>>         path length 265 exceeds max length 260
>>>>             path
>>>>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClas
>>>> s\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>> xxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>>>
>>>>         webrev:
>>>>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>>>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>>>>
>>>>         Testing:
>>>>             JPRT (including the new test)
>>>>
>>>>         thanks,
>>>>         Calvin
>>>>
>>>>
>>>>
>>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung
Hi Thomas,

On 11/8/17, 10:40 PM, Thomas Stüfe wrote:

> Hi Calvin,
>
> On Wed, Nov 8, 2017 at 6:27 PM, Calvin Cheung
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>
>
>     On 11/7/17, 6:12 AM, Thomas Stüfe wrote:
>>     Hi Calvin,
>>
>>     On Wed, Nov 1, 2017 at 8:07 PM, Calvin Cheung
>>     <[hidden email] <mailto:[hidden email]>> wrote:
>>
>>
>>
>>         On 10/27/17, 12:55 AM, Thomas Stüfe wrote:
>>>         Hi Calvin,
>>>
>>>         On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung
>>>         <[hidden email] <mailto:[hidden email]>>
>>>         wrote:
>>>
>>>             Hi Thomas,
>>>
>>>             On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>>>
>>>                 Hi Calvin,
>>>
>>>                 this is a worthwhile addition, thank you for your work!
>>>
>>>             Thanks for your review.
>>>
>>>
>>>         Thanks for your work :)
>>>
>>>         First off, there is another issue in
>>>         file_attribute_data_to_stat(). We actually had this issue
>>>         before, but your work uncovered it:
>>>
>>>         According to POSIX
>>>         (http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html
>>>         <http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html>)
>>>         and every unix manpage I looked at (solaris, linux, aix..),
>>>         st_ctime is not the file creation time but the last time
>>>         file status was changed. Only Microsoft gets this wrong in
>>>         their posix layer, its stat() function returns
>>>          (https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx
>>>         <https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx>)
>>>         creation time in st_ctime.
>>>
>>>         But as os::stat() is a platform independent layer, I'd say
>>>         we should behave the same on all platforms, and that would
>>>         be the Posix way.
>>>
>>>         I did not find any code calling os::stat() and using
>>>         st_ctime, so this is still save to change for windows. (Note
>>>         that I found that perfMemory_xxx.cpp uses plain OS ::stat
>>>         and misuses ctime as "creation time" - I opened a bug for
>>>         that (https://bugs.openjdk.java.net/browse/JDK-8190260
>>>         <https://bugs.openjdk.java.net/browse/JDK-8190260> - but
>>>         that does not affect us, as they do not call os::stat()).
>>>
>>>         There is one more complication: in os::stat() we use plain
>>>         ::stat() in the single byte case.:
>>>
>>>         *+ if (strlen(path) < MAX_PATH) {*
>>>         *+     ret = ::stat(pathbuf, sbuf);*
>>>         *+   } else {*
>>>         *
>>>         *
>>>         But ::stat() on Windows is broken, as I wrote above. We
>>>         should not use it, especially not if we change the meaing of
>>>         st_ctime in the double byte path.
>>>
>>>         My suggestion would be to just always call
>>>         GetFileAttributes(), also for the single byte path. A very
>>>         simple solution would be to just always go the double byte
>>>         path with UNC translation, regardless of the path
>>>         length. Or, if you are worried about performance, for paths
>>>         shorter than MAX_PATH we use GetFileAttributesA(), for
>>>         longer paths GetFileAttributesW with UNC translation. In
>>>         both cases you get a WIN32_FILE_ATTRIBUTE_DATA which you can
>>>         feed tp your  file_attribute_data_to_stat() routine and have
>>>         the struct stat you return be always consistent.
>>         I ran into an issue with the dwFileAttributes value for a jar
>>         file. On Windows Server O/S, the value is set to 0x2020 which
>>         is (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>         FILE_ATTRIBUTE_ARCHIVE) but on a desktop O/S like Windows 7,
>>         it is set to 0x0020 which is just FILE_ATTRIBUTE_ARCHIVE.
>>         I've fixed it in file_attribute_data_to_stat().
>>
>>
>>     Oh.. :( good catch! But that opens a new can of worms I did not
>>     see before:
>>
>>     FILE_ATTRIBUTE_NORMAL is documented as "A file that does not have
>>     other attributes set. This attribute is valid only when used
>>     alone." In addition to this flag, we have a multitude of things
>>     like FILE_ATTRIBUTE_ARCHIVE, FILE_ATTRIBUTE_ENCRYPTED,
>>     FILE_ATTRIBUTE_READONLY ... etc, all cases where we assume this
>>     is a normal file in Unix terminology and where we would expect
>>     os::stat to return S_IFREG, but where according to the msdn doc
>>     FILE_ATTRIBUTE_NORMAL is not set.
>>
>>     Looking at what others do in this scenario (Looked at mingw
>>     sources and at ReactOS - I am not posting any source code here
>>     for legal reasons but feel free to look for yourself), the usual
>>     way to translate WIN32_FILE_ATTRIBUTES to struct stat seems to be:
>>     "if FILE_ATTRIBUTE_DIRECTORY then set S_IFDIR else S_IFREG" (so,
>>     no dependency on FILE_ATTRIBUTE_NORMAL).
>     This makes sense but I ran into similar problem as before - the
>     dwFileAttributes has a different value on a windows server O/S vs
>     desktop O/S. So I need to do the check as follows:
>
>     +   // A directory has the dwFileAttributes value of 0x2010 which is
>     +   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
>     +   // on Windows Server O/S but the value is 0x0010 on Windows desktop O/S
>     +   // such as Windows 7.
>     +   if ((file_data.dwFileAttributes&  FILE_ATTRIBUTE_DIRECTORY) != 0) {
>     +     sbuf->st_mode |= S_IFDIR;
>     +   } else {
>     +     sbuf->st_mode |= S_IFREG;
>     +   }
>
> I scratched my head a bit about the comment till I understood that you
> mean "if FILE_ATTRIBUTE_DIRECTORY bit is set it is a directory
> regardless of which other flags are set" instead of "if
> flags==FILE_ATTRIBUTE_DIRECTORY it is a directory". Sure, I guess my
> comment above was sloppy, but this was what I meant. I am not even
> sure the comment is needed, this is quite self-explaining.
I've noticed a typo in the above comment:
+   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)

FILE_ATTRIBUTE_ARCHIVE should be FILE_ATTRIBUTE_DIRECTORY

I'll correct it before push.

>
>     updated webrev:
>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.04/
>     <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.04/>
>
>
> I am fine with all the changes now. Thank you for your work and patience.
Thanks for your discussions and review.

thanks,
Calvin

>
> Kind Regards, Thomas
>
>
>
>
>     Diff from webrev.03:
>
>     < --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-08
>     08:50:40.170786397 -0800
>     < +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-08
>     08:50:39.803751877 -0800
>     < @@ -4060,41 +4060,119 @@
>     ---
>     > --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-01
>     09:40:13.657460425 -0700
>     > +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-01
>     09:40:13.261423192 -0700
>     > @@ -4060,41 +4060,121 @@
>     25,29c25
>     < +  // A directory has the dwFileAttributes value of 0x2010 which is
>     < +  // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
>     < +  // on Windows Server O/S but the value is 0x0010 on Windows
>     desktop O/S
>     < +  // such as Windows 7.
>     < +  if ((file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
>     != 0) {
>     ---
>     > +  if (file_data.dwFileAttributes == FILE_ATTRIBUTE_DIRECTORY) {
>     31c27,33
>     < +  } else {
>     ---
>     > +  }
>     > +  if ((file_data.dwFileAttributes == FILE_ATTRIBUTE_NORMAL) ||
>     > +      // an archive file such as a jar file has the
>     dwFileAttributes value of
>     > +      // 0x2020 (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>     FILE_ATTRIBUTE_ARCHIVE)
>     > +      // on Windows Server O/S but the value is 0x0020 on
>     > +      // Windows desktop O/S such as Windows 7.
>     > +      ((file_data.dwFileAttributes & FILE_ATTRIBUTE_ARCHIVE) !=
>     0)) {
>
>>
>>     I wonder about other special cases which should work too:
>>     - read-only files should be S_IFREG and !S_IWRITE,
>     For a read-only system file under the user's home dir.
>
>     st_mode & 0xFF00 = 0x8100 = S_IFREG | S_IREAD
>     dwFileAttributes = 39 = (FILE_ATTRIBUTE_ARCHIVE |
>     FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM |
>     FILE_ATTRIBUTE_READONLY)
>>     read-only directories should be S_IFDIR and !S_IWRITE.
>     I've tried the C:\progra~1 dir.
>
>     st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
>     dwFileAttributes = 17 = (FILE_ATTRIBUTE_DIRECTORY |
>     FILE_ATTRIBUTE_READONLY)
>>     - We should tread the device root ("C:\") as a directory (so,
>>     os::stat("c:\") should return S_IFDIR). Does this work?
>     This one works too.
>
>     st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
>     dwFileAttributes = 22 = (FILE_ATTRIBUTE_DIRECTORY |
>     FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM)
>>
>>         I've also used GetTickCounts() to measure the performance of
>>         ::stat() vs GetFileAttributesA() plus
>>         file_attribute_data_to_stat(). There's no difference at the
>>         ms resolution. Using the high-resolution timestamp
>>         (https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx)
>>         <https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408%28v=vs.85%29.aspx%29>,
>>         it doesn't show much difference.
>>
>>
>>     stat() seems to be implemented using FindFirstFile() (see crt
>>     sources if you can), and we call GetFileAttributesA(), so I do
>>     not think this differs much.
>     Yes, I saw the same in my Visual Studio installation.
>
>     thanks,
>     Calvin
>
>>>
>>>         But onward:
>>>
>>>
>>>
>>>                 =========================
>>>
>>>                 src/hotspot/os/windows/os_windows.cpp
>>>
>>>                 Could you please use wchar_t instead of WCHAR?
>>>
>>>             Fixed.
>>>
>>>
>>>                 ---------------
>>>                 in os::stat():
>>>
>>>                 This cumbersome malloc/strcpy sequence:
>>>
>>>                 !   size_t path_len = strlen(path) + 1;
>>>                 +   char* pathbuf = (char*)os::malloc(path_len *
>>>                 sizeof(char), mtInternal);
>>>                 +   if (pathbuf == NULL) {
>>>                 +     errno = ENOMEM;
>>>                       return -1;
>>>                     }
>>>                     os::native_path(strcpy(pathbuf, path));
>>>
>>>                  can be reduced to a simple strdup:
>>>
>>>                 char* pathbuf = os::strdup(path, mtInternal);
>>>                 if (pathbuf == NULL) {
>>>                   errno = ENOMEM;
>>>                   return -1;
>>>                 }
>>>                 os::native_path(pathbuf);
>>>
>>>                 This also would separate strcpy() from
>>>                 os::native_path() which is a bit unreadable.
>>>
>>>             I've made the change.
>>>
>>>
>>>                 --------------------
>>>
>>>                 The single-byte path (strdup, ::stat()), together
>>>                 with its free(), can be moved inside the
>>>                 (strlen(path) < MAX_PATH) condition. os::native_path
>>>                 will not change the path length (it better not, as
>>>                 it operates on the input buffer). That avoids
>>>                 having two allocations on the doublebyte path.
>>>
>>>
>>>             os::native_path() converts a path like C:\\\\somedir to
>>>             C:\\somedir
>>>             So I'll need the converted path in the wchar_t case too.
>>>             The freeing of the pathbuf is being done at the end of
>>>             the function or in the middle of the wchar_t case if
>>>             there's an error.
>>>
>>>
>>>         Oh, you are right,of course. I missed that it this is needed
>>>         for both paths.
>>>
>>>
>>>
>>>                 -----------------------
>>>
>>>                 But seeing that translation to UNC paths is
>>>                 something we might want more often, I would combine
>>>                 allocation and UNC prefix adding to one function
>>>                 like this, to avoid the memove and increase reusability:
>>>
>>>                 // Creates an unc path from a single byte path.
>>>                 Return buffer is allocated in C heap and needs to be
>>>                 freed by caller. Returns NULL on error.
>>>                 static whchar_t* create_unc_path(const char* s) {
>>>                   - does s start with \\?\ ?
>>>                 - yes:
>>>                             - os::malloc(strlen(s) + 1) and mbstowcs_s
>>>                         - no:
>>>                             - os::malloc(strlen(s) + 1 + 4),
>>>                 mbstowcs_s to fourth position in string, prefix with
>>>                 L"\\?\"
>>>                 }
>>>
>>>
>>>             I also include the case for adding  L"\\\\?\\UNC\0" at
>>>             the beginning to be consistent with
>>>             libjava/canonicalize_md.c.
>>>
>>>
>>>                 We also need error checking to mbstowcs_s.
>>>
>>>             I've added assert like the following after the call:
>>>
>>>             assert(converted_chars == path_len, "sanity");
>>>
>>>
>>>
>>>         create_unc_path() :
>>>
>>>         - could you convert the /**/ to // comments, please?
>>         Fixed.
>>
>>
>>     thanks.
>>
>>
>>>
>>>         - not sure about the assert after mbstowcs_s: if we happen
>>>         to encounter an invalid multibyte character, function will
>>>         fail and return an error:
>>>
>>>         https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
>>>         <https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx>
>>>         "If mbstowcs_s encounters an invalid multibyte character, it
>>>         puts 0 in *``pReturnValue, sets the destination buffer to an
>>>         empty string, sets errno to EILSEQ, and returns EILSEQ."
>>         I've changed create_unc_path() so that the caller will get
>>         the errno and removed the assert.
>>
>>         static wchar_t* create_unc_path(const char* path, errno_t &err)
>>
>>
>>     Okay, works for me.
>>
>>
>>>
>>>          As this is dependent on user data, we should not assert,
>>>         but handle the return code of mbstowcs_s gracefully. Same
>>>         goes for libjava/canonicalize_md.c.
>>>
>>>         - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len +
>>>         7, path, path_len);
>>>         third parameter is wrong, as we hand in an offset into the
>>>         buffer, we must decrement the buffer size by this offset, so
>>>         correct would be path_len +7 - 7 or just path_len.
>>>
>>>         - Same error below: + ::mbstowcs_s(&converted_chars,
>>>         &wpath[4], path_len + 4, path, path_len);
>>         Fixed in both places.
>>
>>
>>     Okay.
>>
>>
>>>
>>>
>>>                 Just for cleanliness, I would then wrap deallocation
>>>                 into an own function.
>>>
>>>                 static viud destroy_unc_path(whchar_t* s) {
>>>                 os::free(s); }
>>>
>>>             I've added the destroy function.
>>>
>>>
>>>                 These two functions could be candidates of putting
>>>                 into os::win32 namespace, as this form of ANSI->UNC
>>>                 path translation is quite common - whoever wants to
>>>                 use the xxxxW() functions must do this.
>>>
>>>             I'm leaving them in os_windows.cpp since they're being
>>>             used only within that file.
>>>
>>>
>>>         Fine by me.
>>>
>>>
>>>
>>>                 -----------------------
>>>
>>>                 FindFirstFileW:
>>>
>>>                 I am pretty sure that you can achieve the same
>>>                 result with GetFileAttributesExW(). It also returns
>>>                 WIN32_FIND_DATAW.
>>>
>>>             It actually returns WIN32_FILE_ATTRIBUTE_DATA and is
>>>             very similar to WIN32_FIND_DATAW.
>>>
>>>
>>>                 It is way more straightforward to use than
>>>                 FindFirstFileW, as it does not require you to write
>>>                 a callback hook.
>>>
>>>             I've switched to using GetFileAttributesExW().
>>>
>>>
>>>         Thank you, this is way more readable.
>>>         Another issue: do we still need the fix for 6539723 if we
>>>         switch from stat() to GetFileAttributesExW()? This fix looks
>>>         important, but the comment
>>>         indicates that it could break things if the original bug is
>>>         not present.
>>>
>>>         Btw, this is another strong argument for scrapping ::stat()
>>>         altogether on all code paths, not only for long input paths,
>>>         because stat() and GetFileAttributesExW() may behave
>>>         differently. So it would be good to use the same API on all
>>>         code paths, in order to get the best test coverage.
>>         For this round of change, I'm using
>>         GetFileAttributesEx[A|W]() for both code paths.
>>
>>         webrev:
>>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/
>>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.03/>
>>
>>         thanks,
>>         Calvin
>>
>>
>>     Okay, all good apart from the issues mentioned above. Thanks for
>>     your work!
>>
>>     Best Regards, Thomas
>>
>>
>>
>>>
>>>
>>>
>>>                 -------
>>>
>>>                 eval_find_data(): This is more of a generic helper
>>>                 function, could you rename this to something
>>>                 clearer, e.g. make_double_word() ?
>>>
>>>             Ok. I've renamed it.
>>>
>>>
>>>                 Also, setup_stat(), could this be renamed more
>>>                 clearly to something like WIN32_FIND_DATA_to_stat?
>>>                 or lowercase if this bothers you :)
>>>
>>>             I'm naming the function as file_attribute_data_to_stat()
>>>             to match with the data structure name.
>>>
>>>
>>>         Thanks for taking my suggestions.
>>>
>>>
>>>
>>>                 ==================================
>>>                 src/hotspot/share/classfile/classLoader.cpp
>>>
>>>                 In ClassPathDirEntry::open_stream(), I would feel
>>>                 better if we were asserting _dir and name to be !=
>>>                 NULL before feeding it to strlen.
>>>
>>>             I've added an assert statement.
>>>
>>>
>>>                 ===================
>>>
>>>             Here's an updated webrev:
>>>             http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>>>             <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.02/>
>>>
>>>
>>>         Thanks!
>>>
>>>         Thomas
>>>
>>>             thanks,
>>>             Calvin
>>>
>>>
>>>                 Thanks!
>>>
>>>                 Thomas
>>>
>>>
>>>
>>>
>>>
>>>
>>>                 On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung
>>>                 <[hidden email]
>>>                 <mailto:[hidden email]>
>>>                 <mailto:[hidden email]
>>>                 <mailto:[hidden email]>>> wrote:
>>>
>>>                     I've reworked this fix by using the Unicode
>>>                 version of open
>>>                     (wopen) to handle path name longer than max path
>>>                 with the path
>>>                     prefixed to indicate an extended length path as
>>>                 described in [1].
>>>
>>>                     The Unicode version of stat (wstat) doesn't work
>>>                 well with long
>>>                     path [2]. So FindFirstFileW will be used.The data in
>>>                     WIN32_FIND_DATA returned from FindFirstFileW
>>>                 needs to be
>>>                     transferred to the stat structure since the
>>>                 caller expects a
>>>                     return stat structure and other platforms return
>>>                 a stat structure.
>>>
>>>                     In classLoader.cpp, calculate the size of buffer
>>>                 required instead
>>>                     of limiting it to JVM_MAXPATHLEN.
>>>                     In os_windows.cpp, dynamically allocate buffers
>>>                 in os::open and
>>>                     os::stat.
>>>
>>>                     updated webrev:
>>>                 http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/
>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>>
>>>
>>>                     It passed hs-tier2 testing using mach5.
>>>
>>>                     thanks,
>>>                     Calvin
>>>
>>>                     [1]
>>>                 https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH
>>>                 <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>
>>>                 <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH
>>>                 <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>>
>>>
>>>                     [2]
>>>                 https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>>                 <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>>>                 <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>>                 <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>>
>>>
>>>
>>>
>>>                     On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>>>
>>>                         JBS:
>>>                 https://bugs.openjdk.java.net/browse/JDK-8188122
>>>                 <https://bugs.openjdk.java.net/browse/JDK-8188122>
>>>                 <https://bugs.openjdk.java.net/browse/JDK-8188122
>>>                 <https://bugs.openjdk.java.net/browse/JDK-8188122>>
>>>
>>>                         Adding a warning message if the full path or
>>>                 the directory
>>>                         length exceeds MAX_PATH on windows.
>>>
>>>                         Example warning messages.
>>>
>>>                         1) The full path exceeds MAX_PATH:
>>>
>>>                         Java HotSpot(TM) 64-Bit Server VM warning:
>>>                 construct full path
>>>                         name failed: total length 270 exceeds max
>>>                 length of 260
>>>                             dir
>>>                        
>>>                 T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>                         length 259
>>>                             name Hello.class length 11
>>>
>>>                         2) The directory path exceeds MAX_PATH:
>>>
>>>                         Java HotSpot(TM) 64-Bit Server VM warning:
>>>                 os::stat failed:
>>>                         path length 265 exceeds max length 260
>>>                             path
>>>                        
>>>                 T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>>
>>>                         webrev:
>>>                 http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/
>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>>
>>>
>>>                         Testing:
>>>                             JPRT (including the new test)
>>>
>>>                         thanks,
>>>                         Calvin
>>>
>>>
>>>
>>
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung
I've had some off-list discussion with Ioi resulting in another update
to the webrev.

- added relative path scenario to the test. Currently, this fix doesn't
handle relative path on windows yet. The following RFE has been filed to
cover long relative paths on windows:
https://bugs.openjdk.java.net/browse/JDK-8191521
- renamed the test LongPath.java to LongBCP.java to better reflect what
is being tested;
- some comments update on os_windows.cpp

updated webrev:
     http://cr.openjdk.java.net/~ccheung/8188122/webrev.05/

thanks,
Calvin

On 11/9/17, 9:23 AM, Calvin Cheung wrote:

> Hi Thomas,
>
> On 11/8/17, 10:40 PM, Thomas Stüfe wrote:
>> Hi Calvin,
>>
>> On Wed, Nov 8, 2017 at 6:27 PM, Calvin Cheung
>> <[hidden email] <mailto:[hidden email]>> wrote:
>>
>>
>>
>>     On 11/7/17, 6:12 AM, Thomas Stüfe wrote:
>>>     Hi Calvin,
>>>
>>>     On Wed, Nov 1, 2017 at 8:07 PM, Calvin Cheung
>>>     <[hidden email] <mailto:[hidden email]>> wrote:
>>>
>>>
>>>
>>>         On 10/27/17, 12:55 AM, Thomas Stüfe wrote:
>>>>         Hi Calvin,
>>>>
>>>>         On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung
>>>>         <[hidden email]
>>>>         <mailto:[hidden email]>> wrote:
>>>>
>>>>             Hi Thomas,
>>>>
>>>>             On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>>>>
>>>>                 Hi Calvin,
>>>>
>>>>                 this is a worthwhile addition, thank you for your work!
>>>>
>>>>             Thanks for your review.
>>>>
>>>>
>>>>         Thanks for your work :)
>>>>
>>>>         First off, there is another issue in
>>>>         file_attribute_data_to_stat(). We actually had this issue
>>>>         before, but your work uncovered it:
>>>>
>>>>         According to POSIX
>>>>         (http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html
>>>>         <http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html>)
>>>>         and every unix manpage I looked at (solaris, linux, aix..),
>>>>         st_ctime is not the file creation time but the last time
>>>>         file status was changed. Only Microsoft gets this wrong in
>>>>         their posix layer, its stat() function returns
>>>>          (https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx
>>>>         <https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx>)
>>>>         creation time in st_ctime.
>>>>
>>>>         But as os::stat() is a platform independent layer, I'd say
>>>>         we should behave the same on all platforms, and that would
>>>>         be the Posix way.
>>>>
>>>>         I did not find any code calling os::stat() and using
>>>>         st_ctime, so this is still save to change for windows.
>>>>         (Note that I found that perfMemory_xxx.cpp uses plain OS
>>>>         ::stat and misuses ctime as "creation time" - I opened a
>>>>         bug for that
>>>>         (https://bugs.openjdk.java.net/browse/JDK-8190260
>>>>         <https://bugs.openjdk.java.net/browse/JDK-8190260> - but
>>>>         that does not affect us, as they do not call os::stat()).
>>>>
>>>>         There is one more complication: in os::stat() we use plain
>>>>         ::stat() in the single byte case.:
>>>>
>>>>         *+ if (strlen(path) < MAX_PATH) {*
>>>>         *+     ret = ::stat(pathbuf, sbuf);*
>>>>         *+   } else {*
>>>>         *
>>>>         *
>>>>         But ::stat() on Windows is broken, as I wrote above. We
>>>>         should not use it, especially not if we change the meaing
>>>>         of st_ctime in the double byte path.
>>>>
>>>>         My suggestion would be to just always call
>>>>         GetFileAttributes(), also for the single byte path. A very
>>>>         simple solution would be to just always go the double byte
>>>>         path with UNC translation, regardless of the path
>>>>         length. Or, if you are worried about performance, for paths
>>>>         shorter than MAX_PATH we use GetFileAttributesA(), for
>>>>         longer paths GetFileAttributesW with UNC translation. In
>>>>         both cases you get a WIN32_FILE_ATTRIBUTE_DATA which you
>>>>         can feed tp your  file_attribute_data_to_stat() routine and
>>>>         have the struct stat you return be always consistent.
>>>         I ran into an issue with the dwFileAttributes value for a
>>>         jar file. On Windows Server O/S, the value is set to 0x2020
>>>         which is (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>>         FILE_ATTRIBUTE_ARCHIVE) but on a desktop O/S like Windows 7,
>>>         it is set to 0x0020 which is just FILE_ATTRIBUTE_ARCHIVE.
>>>         I've fixed it in file_attribute_data_to_stat().
>>>
>>>
>>>     Oh.. :( good catch! But that opens a new can of worms I did not
>>>     see before:
>>>
>>>     FILE_ATTRIBUTE_NORMAL is documented as "A file that does not
>>>     have other attributes set. This attribute is valid only when
>>>     used alone." In addition to this flag, we have a multitude of
>>>     things like FILE_ATTRIBUTE_ARCHIVE, FILE_ATTRIBUTE_ENCRYPTED,
>>>     FILE_ATTRIBUTE_READONLY ... etc, all cases where we assume this
>>>     is a normal file in Unix terminology and where we would expect
>>>     os::stat to return S_IFREG, but where according to the msdn doc
>>>     FILE_ATTRIBUTE_NORMAL is not set.
>>>
>>>     Looking at what others do in this scenario (Looked at mingw
>>>     sources and at ReactOS - I am not posting any source code here
>>>     for legal reasons but feel free to look for yourself), the usual
>>>     way to translate WIN32_FILE_ATTRIBUTES to struct stat seems to be:
>>>     "if FILE_ATTRIBUTE_DIRECTORY then set S_IFDIR else S_IFREG" (so,
>>>     no dependency on FILE_ATTRIBUTE_NORMAL).
>>     This makes sense but I ran into similar problem as before - the
>>     dwFileAttributes has a different value on a windows server O/S vs
>>     desktop O/S. So I need to do the check as follows:
>>
>>     +   // A directory has the dwFileAttributes value of 0x2010 which is
>>     +   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
>>     +   // on Windows Server O/S but the value is 0x0010 on Windows desktop O/S
>>     +   // such as Windows 7.
>>     +   if ((file_data.dwFileAttributes&  FILE_ATTRIBUTE_DIRECTORY) != 0) {
>>     +     sbuf->st_mode |= S_IFDIR;
>>     +   } else {
>>     +     sbuf->st_mode |= S_IFREG;
>>     +   }
>>
>> I scratched my head a bit about the comment till I understood that
>> you mean "if FILE_ATTRIBUTE_DIRECTORY bit is set it is a directory
>> regardless of which other flags are set" instead of "if
>> flags==FILE_ATTRIBUTE_DIRECTORY it is a directory". Sure, I guess my
>> comment above was sloppy, but this was what I meant. I am not even
>> sure the comment is needed, this is quite self-explaining.
> I've noticed a typo in the above comment:
> +   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
>
> FILE_ATTRIBUTE_ARCHIVE should be FILE_ATTRIBUTE_DIRECTORY
>
> I'll correct it before push.
>
>>
>>     updated webrev:
>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.04/
>>     <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.04/>
>>
>>
>> I am fine with all the changes now. Thank you for your work and patience.
> Thanks for your discussions and review.
>
> thanks,
> Calvin
>>
>> Kind Regards, Thomas
>>
>>
>>
>>
>>     Diff from webrev.03:
>>
>>     < --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-08
>>     08:50:40.170786397 -0800
>>     < +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-08
>>     08:50:39.803751877 -0800
>>     < @@ -4060,41 +4060,119 @@
>>     ---
>>     > --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-01
>>     09:40:13.657460425 -0700
>>     > +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-01
>>     09:40:13.261423192 -0700
>>     > @@ -4060,41 +4060,121 @@
>>     25,29c25
>>     < +  // A directory has the dwFileAttributes value of 0x2010 which is
>>     < +  // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
>>     < +  // on Windows Server O/S but the value is 0x0010 on Windows
>>     desktop O/S
>>     < +  // such as Windows 7.
>>     < +  if ((file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
>>     != 0) {
>>     ---
>>     > +  if (file_data.dwFileAttributes == FILE_ATTRIBUTE_DIRECTORY) {
>>     31c27,33
>>     < +  } else {
>>     ---
>>     > +  }
>>     > +  if ((file_data.dwFileAttributes == FILE_ATTRIBUTE_NORMAL) ||
>>     > +      // an archive file such as a jar file has the
>>     dwFileAttributes value of
>>     > +      // 0x2020 (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>     FILE_ATTRIBUTE_ARCHIVE)
>>     > +      // on Windows Server O/S but the value is 0x0020 on
>>     > +      // Windows desktop O/S such as Windows 7.
>>     > +      ((file_data.dwFileAttributes & FILE_ATTRIBUTE_ARCHIVE)
>>     != 0)) {
>>
>>>
>>>     I wonder about other special cases which should work too:
>>>     - read-only files should be S_IFREG and !S_IWRITE,
>>     For a read-only system file under the user's home dir.
>>
>>     st_mode & 0xFF00 = 0x8100 = S_IFREG | S_IREAD
>>     dwFileAttributes = 39 = (FILE_ATTRIBUTE_ARCHIVE |
>>     FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM |
>>     FILE_ATTRIBUTE_READONLY)
>>>     read-only directories should be S_IFDIR and !S_IWRITE.
>>     I've tried the C:\progra~1 dir.
>>
>>     st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
>>     dwFileAttributes = 17 = (FILE_ATTRIBUTE_DIRECTORY |
>>     FILE_ATTRIBUTE_READONLY)
>>>     - We should tread the device root ("C:\") as a directory (so,
>>>     os::stat("c:\") should return S_IFDIR). Does this work?
>>     This one works too.
>>
>>     st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
>>     dwFileAttributes = 22 = (FILE_ATTRIBUTE_DIRECTORY |
>>     FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM)
>>>
>>>         I've also used GetTickCounts() to measure the performance of
>>>         ::stat() vs GetFileAttributesA() plus
>>>         file_attribute_data_to_stat(). There's no difference at the
>>>         ms resolution. Using the high-resolution timestamp
>>>         (https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx)
>>>         <https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408%28v=vs.85%29.aspx%29>,
>>>         it doesn't show much difference.
>>>
>>>
>>>     stat() seems to be implemented using FindFirstFile() (see crt
>>>     sources if you can), and we call GetFileAttributesA(), so I do
>>>     not think this differs much.
>>     Yes, I saw the same in my Visual Studio installation.
>>
>>     thanks,
>>     Calvin
>>
>>>>
>>>>         But onward:
>>>>
>>>>
>>>>
>>>>                 =========================
>>>>
>>>>                 src/hotspot/os/windows/os_windows.cpp
>>>>
>>>>                 Could you please use wchar_t instead of WCHAR?
>>>>
>>>>             Fixed.
>>>>
>>>>
>>>>                 ---------------
>>>>                 in os::stat():
>>>>
>>>>                 This cumbersome malloc/strcpy sequence:
>>>>
>>>>                 !   size_t path_len = strlen(path) + 1;
>>>>                 +   char* pathbuf = (char*)os::malloc(path_len *
>>>>                 sizeof(char), mtInternal);
>>>>                 +   if (pathbuf == NULL) {
>>>>                 +     errno = ENOMEM;
>>>>                       return -1;
>>>>                     }
>>>>                     os::native_path(strcpy(pathbuf, path));
>>>>
>>>>                  can be reduced to a simple strdup:
>>>>
>>>>                 char* pathbuf = os::strdup(path, mtInternal);
>>>>                 if (pathbuf == NULL) {
>>>>                   errno = ENOMEM;
>>>>                   return -1;
>>>>                 }
>>>>                 os::native_path(pathbuf);
>>>>
>>>>                 This also would separate strcpy() from
>>>>                 os::native_path() which is a bit unreadable.
>>>>
>>>>             I've made the change.
>>>>
>>>>
>>>>                 --------------------
>>>>
>>>>                 The single-byte path (strdup, ::stat()), together
>>>>                 with its free(), can be moved inside the
>>>>                 (strlen(path) < MAX_PATH) condition.
>>>>                 os::native_path will not change the path length (it
>>>>                 better not, as it operates on the input buffer).
>>>>                 That avoids  having two allocations on the
>>>>                 doublebyte path.
>>>>
>>>>
>>>>             os::native_path() converts a path like C:\\\\somedir to
>>>>             C:\\somedir
>>>>             So I'll need the converted path in the wchar_t case
>>>>             too. The freeing of the pathbuf is being done at the
>>>>             end of the function or in the middle of the wchar_t
>>>>             case if there's an error.
>>>>
>>>>
>>>>         Oh, you are right,of course. I missed that it this is
>>>>         needed for both paths.
>>>>
>>>>
>>>>
>>>>                 -----------------------
>>>>
>>>>                 But seeing that translation to UNC paths is
>>>>                 something we might want more often, I would combine
>>>>                 allocation and UNC prefix adding to one function
>>>>                 like this, to avoid the memove and increase
>>>>                 reusability:
>>>>
>>>>                 // Creates an unc path from a single byte path.
>>>>                 Return buffer is allocated in C heap and needs to
>>>>                 be freed by caller. Returns NULL on error.
>>>>                 static whchar_t* create_unc_path(const char* s) {
>>>>                   - does s start with \\?\ ?
>>>>                 - yes:
>>>>                             - os::malloc(strlen(s) + 1) and mbstowcs_s
>>>>                         - no:
>>>>                             - os::malloc(strlen(s) + 1 + 4),
>>>>                 mbstowcs_s to fourth position in string, prefix
>>>>                 with L"\\?\"
>>>>                 }
>>>>
>>>>
>>>>             I also include the case for adding  L"\\\\?\\UNC\0" at
>>>>             the beginning to be consistent with
>>>>             libjava/canonicalize_md.c.
>>>>
>>>>
>>>>                 We also need error checking to mbstowcs_s.
>>>>
>>>>             I've added assert like the following after the call:
>>>>
>>>>             assert(converted_chars == path_len, "sanity");
>>>>
>>>>
>>>>
>>>>         create_unc_path() :
>>>>
>>>>         - could you convert the /**/ to // comments, please?
>>>         Fixed.
>>>
>>>
>>>     thanks.
>>>
>>>
>>>>
>>>>         - not sure about the assert after mbstowcs_s: if we happen
>>>>         to encounter an invalid multibyte character, function will
>>>>         fail and return an error:
>>>>
>>>>         https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
>>>>         <https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx>
>>>>         "If mbstowcs_s encounters an invalid multibyte character,
>>>>         it puts 0 in *``pReturnValue, sets the destination buffer
>>>>         to an empty string, sets errno to EILSEQ, and returns EILSEQ."
>>>         I've changed create_unc_path() so that the caller will get
>>>         the errno and removed the assert.
>>>
>>>         static wchar_t* create_unc_path(const char* path, errno_t &err)
>>>
>>>
>>>     Okay, works for me.
>>>
>>>
>>>>
>>>>          As this is dependent on user data, we should not assert,
>>>>         but handle the return code of mbstowcs_s gracefully. Same
>>>>         goes for libjava/canonicalize_md.c.
>>>>
>>>>         - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len
>>>>         + 7, path, path_len);
>>>>         third parameter is wrong, as we hand in an offset into the
>>>>         buffer, we must decrement the buffer size by this offset,
>>>>         so correct would be path_len +7 - 7 or just path_len.
>>>>
>>>>         - Same error below: + ::mbstowcs_s(&converted_chars,
>>>>         &wpath[4], path_len + 4, path, path_len);
>>>         Fixed in both places.
>>>
>>>
>>>     Okay.
>>>
>>>
>>>>
>>>>
>>>>                 Just for cleanliness, I would then wrap
>>>>                 deallocation into an own function.
>>>>
>>>>                 static viud destroy_unc_path(whchar_t* s) {
>>>>                 os::free(s); }
>>>>
>>>>             I've added the destroy function.
>>>>
>>>>
>>>>                 These two functions could be candidates of putting
>>>>                 into os::win32 namespace, as this form of ANSI->UNC
>>>>                 path translation is quite common - whoever wants to
>>>>                 use the xxxxW() functions must do this.
>>>>
>>>>             I'm leaving them in os_windows.cpp since they're being
>>>>             used only within that file.
>>>>
>>>>
>>>>         Fine by me.
>>>>
>>>>
>>>>
>>>>                 -----------------------
>>>>
>>>>                 FindFirstFileW:
>>>>
>>>>                 I am pretty sure that you can achieve the same
>>>>                 result with GetFileAttributesExW(). It also returns
>>>>                 WIN32_FIND_DATAW.
>>>>
>>>>             It actually returns WIN32_FILE_ATTRIBUTE_DATA and is
>>>>             very similar to WIN32_FIND_DATAW.
>>>>
>>>>
>>>>                 It is way more straightforward to use than
>>>>                 FindFirstFileW, as it does not require you to write
>>>>                 a callback hook.
>>>>
>>>>             I've switched to using GetFileAttributesExW().
>>>>
>>>>
>>>>         Thank you, this is way more readable.
>>>>         Another issue: do we still need the fix for 6539723 if we
>>>>         switch from stat() to GetFileAttributesExW()? This fix
>>>>         looks important, but the comment
>>>>         indicates that it could break things if the original bug is
>>>>         not present.
>>>>
>>>>         Btw, this is another strong argument for scrapping ::stat()
>>>>         altogether on all code paths, not only for long input
>>>>         paths, because stat() and GetFileAttributesExW() may behave
>>>>         differently. So it would be good to use the same API on all
>>>>         code paths, in order to get the best test coverage.
>>>         For this round of change, I'm using
>>>         GetFileAttributesEx[A|W]() for both code paths.
>>>
>>>         webrev:
>>>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/
>>>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.03/>
>>>
>>>         thanks,
>>>         Calvin
>>>
>>>
>>>     Okay, all good apart from the issues mentioned above. Thanks for
>>>     your work!
>>>
>>>     Best Regards, Thomas
>>>
>>>
>>>
>>>>
>>>>
>>>>
>>>>                 -------
>>>>
>>>>                 eval_find_data(): This is more of a generic helper
>>>>                 function, could you rename this to something
>>>>                 clearer, e.g. make_double_word() ?
>>>>
>>>>             Ok. I've renamed it.
>>>>
>>>>
>>>>                 Also, setup_stat(), could this be renamed more
>>>>                 clearly to something like WIN32_FIND_DATA_to_stat?
>>>>                 or lowercase if this bothers you :)
>>>>
>>>>             I'm naming the function as
>>>>             file_attribute_data_to_stat() to match with the data
>>>>             structure name.
>>>>
>>>>
>>>>         Thanks for taking my suggestions.
>>>>
>>>>
>>>>
>>>>                 ==================================
>>>>                 src/hotspot/share/classfile/classLoader.cpp
>>>>
>>>>                 In ClassPathDirEntry::open_stream(), I would feel
>>>>                 better if we were asserting _dir and name to be !=
>>>>                 NULL before feeding it to strlen.
>>>>
>>>>             I've added an assert statement.
>>>>
>>>>
>>>>                 ===================
>>>>
>>>>             Here's an updated webrev:
>>>>             http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>>>>             <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.02/>
>>>>
>>>>
>>>>         Thanks!
>>>>
>>>>         Thomas
>>>>
>>>>             thanks,
>>>>             Calvin
>>>>
>>>>
>>>>                 Thanks!
>>>>
>>>>                 Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                 On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung
>>>>                 <[hidden email]
>>>>                 <mailto:[hidden email]>
>>>>                 <mailto:[hidden email]
>>>>                 <mailto:[hidden email]>>> wrote:
>>>>
>>>>                     I've reworked this fix by using the Unicode
>>>>                 version of open
>>>>                     (wopen) to handle path name longer than max
>>>>                 path with the path
>>>>                     prefixed to indicate an extended length path as
>>>>                 described in [1].
>>>>
>>>>                     The Unicode version of stat (wstat) doesn't
>>>>                 work well with long
>>>>                     path [2]. So FindFirstFileW will be used.The
>>>>                 data in
>>>>                     WIN32_FIND_DATA returned from FindFirstFileW
>>>>                 needs to be
>>>>                     transferred to the stat structure since the
>>>>                 caller expects a
>>>>                     return stat structure and other platforms
>>>>                 return a stat structure.
>>>>
>>>>                     In classLoader.cpp, calculate the size of
>>>>                 buffer required instead
>>>>                     of limiting it to JVM_MAXPATHLEN.
>>>>                     In os_windows.cpp, dynamically allocate buffers
>>>>                 in os::open and
>>>>                     os::stat.
>>>>
>>>>                     updated webrev:
>>>>                 http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/
>>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>>
>>>>
>>>>                     It passed hs-tier2 testing using mach5.
>>>>
>>>>                     thanks,
>>>>                     Calvin
>>>>
>>>>                     [1]
>>>>                 https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH
>>>>                 <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>
>>>>                 <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH
>>>>                 <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>>
>>>>
>>>>                     [2]
>>>>                 https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>>>                 <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>>>>                 <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>>>                 <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>>
>>>>
>>>>
>>>>
>>>>                     On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>>>>
>>>>                         JBS:
>>>>                 https://bugs.openjdk.java.net/browse/JDK-8188122
>>>>                 <https://bugs.openjdk.java.net/browse/JDK-8188122>
>>>>                 <https://bugs.openjdk.java.net/browse/JDK-8188122
>>>>                 <https://bugs.openjdk.java.net/browse/JDK-8188122>>
>>>>
>>>>                         Adding a warning message if the full path
>>>>                 or the directory
>>>>                         length exceeds MAX_PATH on windows.
>>>>
>>>>                         Example warning messages.
>>>>
>>>>                         1) The full path exceeds MAX_PATH:
>>>>
>>>>                         Java HotSpot(TM) 64-Bit Server VM warning:
>>>>                 construct full path
>>>>                         name failed: total length 270 exceeds max
>>>>                 length of 260
>>>>                             dir
>>>>                        
>>>>                 T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>>                         length 259
>>>>                             name Hello.class length 11
>>>>
>>>>                         2) The directory path exceeds MAX_PATH:
>>>>
>>>>                         Java HotSpot(TM) 64-Bit Server VM warning:
>>>>                 os::stat failed:
>>>>                         path length 265 exceeds max length 260
>>>>                             path
>>>>                        
>>>>                 T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>>>
>>>>                         webrev:
>>>>                 http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/
>>>>                 <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>>
>>>>
>>>>                         Testing:
>>>>                             JPRT (including the new test)
>>>>
>>>>                         thanks,
>>>>                         Calvin
>>>>
>>>>
>>>>
>>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Ioi Lam
Looks good. Thanks!

- Ioi


On 11/20/17 12:05 PM, Calvin Cheung wrote:

> I've had some off-list discussion with Ioi resulting in another update
> to the webrev.
>
> - added relative path scenario to the test. Currently, this fix
> doesn't handle relative path on windows yet. The following RFE has
> been filed to cover long relative paths on windows:
> https://bugs.openjdk.java.net/browse/JDK-8191521
> - renamed the test LongPath.java to LongBCP.java to better reflect
> what is being tested;
> - some comments update on os_windows.cpp
>
> updated webrev:
>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.05/
>
> thanks,
> Calvin
>
> On 11/9/17, 9:23 AM, Calvin Cheung wrote:
>> Hi Thomas,
>>
>> On 11/8/17, 10:40 PM, Thomas Stüfe wrote:
>>> Hi Calvin,
>>>
>>> On Wed, Nov 8, 2017 at 6:27 PM, Calvin Cheung
>>> <[hidden email] <mailto:[hidden email]>> wrote:
>>>
>>>
>>>
>>>     On 11/7/17, 6:12 AM, Thomas Stüfe wrote:
>>>>     Hi Calvin,
>>>>
>>>>     On Wed, Nov 1, 2017 at 8:07 PM, Calvin Cheung
>>>>     <[hidden email] <mailto:[hidden email]>>
>>>> wrote:
>>>>
>>>>
>>>>
>>>>         On 10/27/17, 12:55 AM, Thomas Stüfe wrote:
>>>>>         Hi Calvin,
>>>>>
>>>>>         On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung
>>>>>         <[hidden email]
>>>>>         <mailto:[hidden email]>> wrote:
>>>>>
>>>>>             Hi Thomas,
>>>>>
>>>>>             On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>>>>>
>>>>>                 Hi Calvin,
>>>>>
>>>>>                 this is a worthwhile addition, thank you for your
>>>>> work!
>>>>>
>>>>>             Thanks for your review.
>>>>>
>>>>>
>>>>>         Thanks for your work :)
>>>>>
>>>>>         First off, there is another issue in
>>>>>         file_attribute_data_to_stat(). We actually had this issue
>>>>>         before, but your work uncovered it:
>>>>>
>>>>>         According to POSIX
>>>>> (http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html 
>>>>>
>>>>> <http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html>)
>>>>>         and every unix manpage I looked at (solaris, linux, aix..),
>>>>>         st_ctime is not the file creation time but the last time
>>>>>         file status was changed. Only Microsoft gets this wrong in
>>>>>         their posix layer, its stat() function returns
>>>>> (https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx
>>>>> <https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx>)
>>>>>         creation time in st_ctime.
>>>>>
>>>>>         But as os::stat() is a platform independent layer, I'd say
>>>>>         we should behave the same on all platforms, and that would
>>>>>         be the Posix way.
>>>>>
>>>>>         I did not find any code calling os::stat() and using
>>>>>         st_ctime, so this is still save to change for windows.
>>>>>         (Note that I found that perfMemory_xxx.cpp uses plain OS
>>>>>         ::stat and misuses ctime as "creation time" - I opened a
>>>>>         bug for that
>>>>>         (https://bugs.openjdk.java.net/browse/JDK-8190260
>>>>> <https://bugs.openjdk.java.net/browse/JDK-8190260> - but
>>>>>         that does not affect us, as they do not call os::stat()).
>>>>>
>>>>>         There is one more complication: in os::stat() we use plain
>>>>>         ::stat() in the single byte case.:
>>>>>
>>>>>         *+ if (strlen(path) < MAX_PATH) {*
>>>>>         *+     ret = ::stat(pathbuf, sbuf);*
>>>>>         *+   } else {*
>>>>>         *
>>>>>         *
>>>>>         But ::stat() on Windows is broken, as I wrote above. We
>>>>>         should not use it, especially not if we change the meaing
>>>>>         of st_ctime in the double byte path.
>>>>>
>>>>>         My suggestion would be to just always call
>>>>>         GetFileAttributes(), also for the single byte path. A very
>>>>>         simple solution would be to just always go the double byte
>>>>>         path with UNC translation, regardless of the path
>>>>>         length. Or, if you are worried about performance, for paths
>>>>>         shorter than MAX_PATH we use GetFileAttributesA(), for
>>>>>         longer paths GetFileAttributesW with UNC translation. In
>>>>>         both cases you get a WIN32_FILE_ATTRIBUTE_DATA which you
>>>>>         can feed tp your  file_attribute_data_to_stat() routine and
>>>>>         have the struct stat you return be always consistent.
>>>>         I ran into an issue with the dwFileAttributes value for a
>>>>         jar file. On Windows Server O/S, the value is set to 0x2020
>>>>         which is (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>>>         FILE_ATTRIBUTE_ARCHIVE) but on a desktop O/S like Windows 7,
>>>>         it is set to 0x0020 which is just FILE_ATTRIBUTE_ARCHIVE.
>>>>         I've fixed it in file_attribute_data_to_stat().
>>>>
>>>>
>>>>     Oh.. :( good catch! But that opens a new can of worms I did not
>>>>     see before:
>>>>
>>>>     FILE_ATTRIBUTE_NORMAL is documented as "A file that does not
>>>>     have other attributes set. This attribute is valid only when
>>>>     used alone." In addition to this flag, we have a multitude of
>>>>     things like FILE_ATTRIBUTE_ARCHIVE, FILE_ATTRIBUTE_ENCRYPTED,
>>>>     FILE_ATTRIBUTE_READONLY ... etc, all cases where we assume this
>>>>     is a normal file in Unix terminology and where we would expect
>>>>     os::stat to return S_IFREG, but where according to the msdn doc
>>>>     FILE_ATTRIBUTE_NORMAL is not set.
>>>>
>>>>     Looking at what others do in this scenario (Looked at mingw
>>>>     sources and at ReactOS - I am not posting any source code here
>>>>     for legal reasons but feel free to look for yourself), the usual
>>>>     way to translate WIN32_FILE_ATTRIBUTES to struct stat seems to be:
>>>>     "if FILE_ATTRIBUTE_DIRECTORY then set S_IFDIR else S_IFREG" (so,
>>>>     no dependency on FILE_ATTRIBUTE_NORMAL).
>>>     This makes sense but I ran into similar problem as before - the
>>>     dwFileAttributes has a different value on a windows server O/S vs
>>>     desktop O/S. So I need to do the check as follows:
>>>
>>>     +   // A directory has the dwFileAttributes value of 0x2010
>>> which is
>>>     +   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>> FILE_ATTRIBUTE_ARCHIVE)
>>>     +   // on Windows Server O/S but the value is 0x0010 on Windows
>>> desktop O/S
>>>     +   // such as Windows 7.
>>>     +   if ((file_data.dwFileAttributes& FILE_ATTRIBUTE_DIRECTORY)
>>> != 0) {
>>>     +     sbuf->st_mode |= S_IFDIR;
>>>     +   } else {
>>>     +     sbuf->st_mode |= S_IFREG;
>>>     +   }
>>>
>>> I scratched my head a bit about the comment till I understood that
>>> you mean "if FILE_ATTRIBUTE_DIRECTORY bit is set it is a directory
>>> regardless of which other flags are set" instead of "if
>>> flags==FILE_ATTRIBUTE_DIRECTORY it is a directory". Sure, I guess my
>>> comment above was sloppy, but this was what I meant. I am not even
>>> sure the comment is needed, this is quite self-explaining.
>> I've noticed a typo in the above comment:
>> +   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
>>
>> FILE_ATTRIBUTE_ARCHIVE should be FILE_ATTRIBUTE_DIRECTORY
>>
>> I'll correct it before push.
>>
>>>
>>>     updated webrev:
>>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.04/
>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.04/>
>>>
>>>
>>> I am fine with all the changes now. Thank you for your work and
>>> patience.
>> Thanks for your discussions and review.
>>
>> thanks,
>> Calvin
>>>
>>> Kind Regards, Thomas
>>>
>>>
>>>
>>>
>>>     Diff from webrev.03:
>>>
>>>     < --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-08
>>>     08:50:40.170786397 -0800
>>>     < +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-08
>>>     08:50:39.803751877 -0800
>>>     < @@ -4060,41 +4060,119 @@
>>>     ---
>>>     > --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-01
>>>     09:40:13.657460425 -0700
>>>     > +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-01
>>>     09:40:13.261423192 -0700
>>>     > @@ -4060,41 +4060,121 @@
>>>     25,29c25
>>>     < +  // A directory has the dwFileAttributes value of 0x2010
>>> which is
>>>     < +  // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>> FILE_ATTRIBUTE_ARCHIVE)
>>>     < +  // on Windows Server O/S but the value is 0x0010 on Windows
>>>     desktop O/S
>>>     < +  // such as Windows 7.
>>>     < +  if ((file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
>>>     != 0) {
>>>     ---
>>>     > +  if (file_data.dwFileAttributes == FILE_ATTRIBUTE_DIRECTORY) {
>>>     31c27,33
>>>     < +  } else {
>>>     ---
>>>     > +  }
>>>     > +  if ((file_data.dwFileAttributes == FILE_ATTRIBUTE_NORMAL) ||
>>>     > +      // an archive file such as a jar file has the
>>>     dwFileAttributes value of
>>>     > +      // 0x2020 (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>>     FILE_ATTRIBUTE_ARCHIVE)
>>>     > +      // on Windows Server O/S but the value is 0x0020 on
>>>     > +      // Windows desktop O/S such as Windows 7.
>>>     > +      ((file_data.dwFileAttributes & FILE_ATTRIBUTE_ARCHIVE)
>>>     != 0)) {
>>>
>>>>
>>>>     I wonder about other special cases which should work too:
>>>>     - read-only files should be S_IFREG and !S_IWRITE,
>>>     For a read-only system file under the user's home dir.
>>>
>>>     st_mode & 0xFF00 = 0x8100 = S_IFREG | S_IREAD
>>>     dwFileAttributes = 39 = (FILE_ATTRIBUTE_ARCHIVE |
>>>     FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM |
>>>     FILE_ATTRIBUTE_READONLY)
>>>>     read-only directories should be S_IFDIR and !S_IWRITE.
>>>     I've tried the C:\progra~1 dir.
>>>
>>>     st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
>>>     dwFileAttributes = 17 = (FILE_ATTRIBUTE_DIRECTORY |
>>>     FILE_ATTRIBUTE_READONLY)
>>>>     - We should tread the device root ("C:\") as a directory (so,
>>>>     os::stat("c:\") should return S_IFDIR). Does this work?
>>>     This one works too.
>>>
>>>     st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
>>>     dwFileAttributes = 22 = (FILE_ATTRIBUTE_DIRECTORY |
>>>     FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM)
>>>>
>>>>         I've also used GetTickCounts() to measure the performance of
>>>>         ::stat() vs GetFileAttributesA() plus
>>>>         file_attribute_data_to_stat(). There's no difference at the
>>>>         ms resolution. Using the high-resolution timestamp
>>>> (https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx)
>>>> <https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408%28v=vs.85%29.aspx%29>,
>>>>         it doesn't show much difference.
>>>>
>>>>
>>>>     stat() seems to be implemented using FindFirstFile() (see crt
>>>>     sources if you can), and we call GetFileAttributesA(), so I do
>>>>     not think this differs much.
>>>     Yes, I saw the same in my Visual Studio installation.
>>>
>>>     thanks,
>>>     Calvin
>>>
>>>>>
>>>>>         But onward:
>>>>>
>>>>>
>>>>>
>>>>>                 =========================
>>>>>
>>>>>                 src/hotspot/os/windows/os_windows.cpp
>>>>>
>>>>>                 Could you please use wchar_t instead of WCHAR?
>>>>>
>>>>>             Fixed.
>>>>>
>>>>>
>>>>>                 ---------------
>>>>>                 in os::stat():
>>>>>
>>>>>                 This cumbersome malloc/strcpy sequence:
>>>>>
>>>>>                 !   size_t path_len = strlen(path) + 1;
>>>>>                 +   char* pathbuf = (char*)os::malloc(path_len *
>>>>>                 sizeof(char), mtInternal);
>>>>>                 +   if (pathbuf == NULL) {
>>>>>                 +     errno = ENOMEM;
>>>>>                       return -1;
>>>>>                     }
>>>>>                     os::native_path(strcpy(pathbuf, path));
>>>>>
>>>>>                  can be reduced to a simple strdup:
>>>>>
>>>>>                 char* pathbuf = os::strdup(path, mtInternal);
>>>>>                 if (pathbuf == NULL) {
>>>>>                   errno = ENOMEM;
>>>>>                   return -1;
>>>>>                 }
>>>>>                 os::native_path(pathbuf);
>>>>>
>>>>>                 This also would separate strcpy() from
>>>>>                 os::native_path() which is a bit unreadable.
>>>>>
>>>>>             I've made the change.
>>>>>
>>>>>
>>>>>                 --------------------
>>>>>
>>>>>                 The single-byte path (strdup, ::stat()), together
>>>>>                 with its free(), can be moved inside the
>>>>>                 (strlen(path) < MAX_PATH) condition.
>>>>>                 os::native_path will not change the path length (it
>>>>>                 better not, as it operates on the input buffer).
>>>>>                 That avoids  having two allocations on the
>>>>>                 doublebyte path.
>>>>>
>>>>>
>>>>>             os::native_path() converts a path like C:\\\\somedir to
>>>>>             C:\\somedir
>>>>>             So I'll need the converted path in the wchar_t case
>>>>>             too. The freeing of the pathbuf is being done at the
>>>>>             end of the function or in the middle of the wchar_t
>>>>>             case if there's an error.
>>>>>
>>>>>
>>>>>         Oh, you are right,of course. I missed that it this is
>>>>>         needed for both paths.
>>>>>
>>>>>
>>>>>
>>>>>                 -----------------------
>>>>>
>>>>>                 But seeing that translation to UNC paths is
>>>>>                 something we might want more often, I would combine
>>>>>                 allocation and UNC prefix adding to one function
>>>>>                 like this, to avoid the memove and increase
>>>>>                 reusability:
>>>>>
>>>>>                 // Creates an unc path from a single byte path.
>>>>>                 Return buffer is allocated in C heap and needs to
>>>>>                 be freed by caller. Returns NULL on error.
>>>>>                 static whchar_t* create_unc_path(const char* s) {
>>>>>                   - does s start with \\?\ ?
>>>>>                 - yes:
>>>>>                             - os::malloc(strlen(s) + 1) and
>>>>> mbstowcs_s
>>>>>                         - no:
>>>>>                             - os::malloc(strlen(s) + 1 + 4),
>>>>>                 mbstowcs_s to fourth position in string, prefix
>>>>>                 with L"\\?\"
>>>>>                 }
>>>>>
>>>>>
>>>>>             I also include the case for adding L"\\\\?\\UNC\0" at
>>>>>             the beginning to be consistent with
>>>>>             libjava/canonicalize_md.c.
>>>>>
>>>>>                 We also need error checking to mbstowcs_s.
>>>>>
>>>>>             I've added assert like the following after the call:
>>>>>
>>>>>             assert(converted_chars == path_len, "sanity");
>>>>>
>>>>>
>>>>>
>>>>>         create_unc_path() :
>>>>>
>>>>>         - could you convert the /**/ to // comments, please?
>>>>         Fixed.
>>>>
>>>>
>>>>     thanks.
>>>>
>>>>
>>>>>
>>>>>         - not sure about the assert after mbstowcs_s: if we happen
>>>>>         to encounter an invalid multibyte character, function will
>>>>>         fail and return an error:
>>>>>
>>>>> https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
>>>>> <https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx>
>>>>>         "If mbstowcs_s encounters an invalid multibyte character,
>>>>>         it puts 0 in *``pReturnValue, sets the destination buffer
>>>>>         to an empty string, sets errno to EILSEQ, and returns
>>>>> EILSEQ."
>>>>         I've changed create_unc_path() so that the caller will get
>>>>         the errno and removed the assert.
>>>>
>>>>         static wchar_t* create_unc_path(const char* path, errno_t
>>>> &err)
>>>>
>>>>
>>>>     Okay, works for me.
>>>>
>>>>
>>>>>
>>>>>          As this is dependent on user data, we should not assert,
>>>>>         but handle the return code of mbstowcs_s gracefully. Same
>>>>>         goes for libjava/canonicalize_md.c.
>>>>>
>>>>>         - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len
>>>>>         + 7, path, path_len);
>>>>>         third parameter is wrong, as we hand in an offset into the
>>>>>         buffer, we must decrement the buffer size by this offset,
>>>>>         so correct would be path_len +7 - 7 or just path_len.
>>>>>
>>>>>         - Same error below: + ::mbstowcs_s(&converted_chars,
>>>>>         &wpath[4], path_len + 4, path, path_len);
>>>>         Fixed in both places.
>>>>
>>>>
>>>>     Okay.
>>>>
>>>>
>>>>>
>>>>>
>>>>>                 Just for cleanliness, I would then wrap
>>>>>                 deallocation into an own function.
>>>>>
>>>>>                 static viud destroy_unc_path(whchar_t* s) {
>>>>>                 os::free(s); }
>>>>>
>>>>>             I've added the destroy function.
>>>>>
>>>>>
>>>>>                 These two functions could be candidates of putting
>>>>>                 into os::win32 namespace, as this form of ANSI->UNC
>>>>>                 path translation is quite common - whoever wants to
>>>>>                 use the xxxxW() functions must do this.
>>>>>
>>>>>             I'm leaving them in os_windows.cpp since they're being
>>>>>             used only within that file.
>>>>>
>>>>>
>>>>>         Fine by me.
>>>>>
>>>>>
>>>>>
>>>>>                 -----------------------
>>>>>
>>>>>                 FindFirstFileW:
>>>>>
>>>>>                 I am pretty sure that you can achieve the same
>>>>>                 result with GetFileAttributesExW(). It also returns
>>>>>                 WIN32_FIND_DATAW.
>>>>>
>>>>>             It actually returns WIN32_FILE_ATTRIBUTE_DATA and is
>>>>>             very similar to WIN32_FIND_DATAW.
>>>>>
>>>>>
>>>>>                 It is way more straightforward to use than
>>>>>                 FindFirstFileW, as it does not require you to write
>>>>>                 a callback hook.
>>>>>
>>>>>             I've switched to using GetFileAttributesExW().
>>>>>
>>>>>
>>>>>         Thank you, this is way more readable.
>>>>>         Another issue: do we still need the fix for 6539723 if we
>>>>>         switch from stat() to GetFileAttributesExW()? This fix
>>>>>         looks important, but the comment
>>>>>         indicates that it could break things if the original bug is
>>>>>         not present.
>>>>>
>>>>>         Btw, this is another strong argument for scrapping ::stat()
>>>>>         altogether on all code paths, not only for long input
>>>>>         paths, because stat() and GetFileAttributesExW() may behave
>>>>>         differently. So it would be good to use the same API on all
>>>>>         code paths, in order to get the best test coverage.
>>>>         For this round of change, I'm using
>>>>         GetFileAttributesEx[A|W]() for both code paths.
>>>>
>>>>         webrev:
>>>> http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/
>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.03/>
>>>>
>>>>         thanks,
>>>>         Calvin
>>>>
>>>>
>>>>     Okay, all good apart from the issues mentioned above. Thanks for
>>>>     your work!
>>>>
>>>>     Best Regards, Thomas
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>                 -------
>>>>>
>>>>>                 eval_find_data(): This is more of a generic helper
>>>>>                 function, could you rename this to something
>>>>>                 clearer, e.g. make_double_word() ?
>>>>>
>>>>>             Ok. I've renamed it.
>>>>>
>>>>>
>>>>>                 Also, setup_stat(), could this be renamed more
>>>>>                 clearly to something like WIN32_FIND_DATA_to_stat?
>>>>>                 or lowercase if this bothers you :)
>>>>>
>>>>>             I'm naming the function as
>>>>>             file_attribute_data_to_stat() to match with the data
>>>>>             structure name.
>>>>>
>>>>>
>>>>>         Thanks for taking my suggestions.
>>>>>
>>>>>
>>>>>
>>>>>                 ==================================
>>>>> src/hotspot/share/classfile/classLoader.cpp
>>>>>
>>>>>                 In ClassPathDirEntry::open_stream(), I would feel
>>>>>                 better if we were asserting _dir and name to be !=
>>>>>                 NULL before feeding it to strlen.
>>>>>
>>>>>             I've added an assert statement.
>>>>>
>>>>>
>>>>>                 ===================
>>>>>
>>>>>             Here's an updated webrev:
>>>>> http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.02/>
>>>>>
>>>>>
>>>>>         Thanks!
>>>>>
>>>>>         Thomas
>>>>>
>>>>>             thanks,
>>>>>             Calvin
>>>>>
>>>>>
>>>>>                 Thanks!
>>>>>
>>>>>                 Thomas
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                 On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung
>>>>>                 <[hidden email]
>>>>>                 <mailto:[hidden email]>
>>>>>                 <mailto:[hidden email]
>>>>> <mailto:[hidden email]>>> wrote:
>>>>>
>>>>>                     I've reworked this fix by using the Unicode
>>>>>                 version of open
>>>>>                     (wopen) to handle path name longer than max
>>>>>                 path with the path
>>>>>                     prefixed to indicate an extended length path as
>>>>>                 described in [1].
>>>>>
>>>>>                     The Unicode version of stat (wstat) doesn't
>>>>>                 work well with long
>>>>>                     path [2]. So FindFirstFileW will be used.The
>>>>>                 data in
>>>>>                     WIN32_FIND_DATA returned from FindFirstFileW
>>>>>                 needs to be
>>>>>                     transferred to the stat structure since the
>>>>>                 caller expects a
>>>>>                     return stat structure and other platforms
>>>>>                 return a stat structure.
>>>>>
>>>>>                     In classLoader.cpp, calculate the size of
>>>>>                 buffer required instead
>>>>>                     of limiting it to JVM_MAXPATHLEN.
>>>>>                     In os_windows.cpp, dynamically allocate buffers
>>>>>                 in os::open and
>>>>>                     os::stat.
>>>>>
>>>>>                     updated webrev:
>>>>> http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/
>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>>
>>>>>
>>>>>                     It passed hs-tier2 testing using mach5.
>>>>>
>>>>>                     thanks,
>>>>>                     Calvin
>>>>>
>>>>>                     [1]
>>>>> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH
>>>>> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>
>>>>> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH
>>>>> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>>
>>>>>
>>>>>                     [2]
>>>>> https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>>>> <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>>>>> <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>>>>> <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>>
>>>>>
>>>>>
>>>>>
>>>>>                     On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>>>>>
>>>>>                         JBS:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8188122
>>>>> <https://bugs.openjdk.java.net/browse/JDK-8188122>
>>>>> <https://bugs.openjdk.java.net/browse/JDK-8188122
>>>>> <https://bugs.openjdk.java.net/browse/JDK-8188122>>
>>>>>
>>>>>                         Adding a warning message if the full path
>>>>>                 or the directory
>>>>>                         length exceeds MAX_PATH on windows.
>>>>>
>>>>>                         Example warning messages.
>>>>>
>>>>>                         1) The full path exceeds MAX_PATH:
>>>>>
>>>>>                         Java HotSpot(TM) 64-Bit Server VM warning:
>>>>>                 construct full path
>>>>>                         name failed: total length 270 exceeds max
>>>>>                 length of 260
>>>>>                             dir
>>>>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>>>                         length 259
>>>>>                             name Hello.class length 11
>>>>>
>>>>>                         2) The directory path exceeds MAX_PATH:
>>>>>
>>>>>                         Java HotSpot(TM) 64-Bit Server VM warning:
>>>>>                 os::stat failed:
>>>>>                         path length 265 exceeds max length 260
>>>>>                             path
>>>>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>>>>
>>>>>                         webrev:
>>>>> http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/
>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>>
>>>>>
>>>>>                         Testing:
>>>>>                             JPRT (including the new test)
>>>>>
>>>>>                         thanks,
>>>>>                         Calvin
>>>>>
>>>>>
>>>>>
>>>>
>>>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung
Thanks, Ioi.

Calvin

On 11/21/17, 8:36 AM, Ioi Lam wrote:

> Looks good. Thanks!
>
> - Ioi
>
>
> On 11/20/17 12:05 PM, Calvin Cheung wrote:
>> I've had some off-list discussion with Ioi resulting in another
>> update to the webrev.
>>
>> - added relative path scenario to the test. Currently, this fix
>> doesn't handle relative path on windows yet. The following RFE has
>> been filed to cover long relative paths on windows:
>> https://bugs.openjdk.java.net/browse/JDK-8191521
>> - renamed the test LongPath.java to LongBCP.java to better reflect
>> what is being tested;
>> - some comments update on os_windows.cpp
>>
>> updated webrev:
>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.05/
>>
>> thanks,
>> Calvin
>>
>> On 11/9/17, 9:23 AM, Calvin Cheung wrote:
>>> Hi Thomas,
>>>
>>> On 11/8/17, 10:40 PM, Thomas Stüfe wrote:
>>>> Hi Calvin,
>>>>
>>>> On Wed, Nov 8, 2017 at 6:27 PM, Calvin Cheung
>>>> <[hidden email] <mailto:[hidden email]>> wrote:
>>>>
>>>>
>>>>
>>>>     On 11/7/17, 6:12 AM, Thomas Stüfe wrote:
>>>>>     Hi Calvin,
>>>>>
>>>>>     On Wed, Nov 1, 2017 at 8:07 PM, Calvin Cheung
>>>>> <[hidden email] <mailto:[hidden email]>> wrote:
>>>>>
>>>>>
>>>>>
>>>>>         On 10/27/17, 12:55 AM, Thomas Stüfe wrote:
>>>>>>         Hi Calvin,
>>>>>>
>>>>>>         On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung
>>>>>> <[hidden email]
>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>
>>>>>>             Hi Thomas,
>>>>>>
>>>>>>             On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>>>>>>
>>>>>>                 Hi Calvin,
>>>>>>
>>>>>>                 this is a worthwhile addition, thank you for your
>>>>>> work!
>>>>>>
>>>>>>             Thanks for your review.
>>>>>>
>>>>>>
>>>>>>         Thanks for your work :)
>>>>>>
>>>>>>         First off, there is another issue in
>>>>>>         file_attribute_data_to_stat(). We actually had this issue
>>>>>>         before, but your work uncovered it:
>>>>>>
>>>>>>         According to POSIX
>>>>>> (http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html 
>>>>>>
>>>>>> <http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html>)
>>>>>>
>>>>>>         and every unix manpage I looked at (solaris, linux, aix..),
>>>>>>         st_ctime is not the file creation time but the last time
>>>>>>         file status was changed. Only Microsoft gets this wrong in
>>>>>>         their posix layer, its stat() function returns
>>>>>> (https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx
>>>>>> <https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx>)
>>>>>>         creation time in st_ctime.
>>>>>>
>>>>>>         But as os::stat() is a platform independent layer, I'd say
>>>>>>         we should behave the same on all platforms, and that would
>>>>>>         be the Posix way.
>>>>>>
>>>>>>         I did not find any code calling os::stat() and using
>>>>>>         st_ctime, so this is still save to change for windows.
>>>>>>         (Note that I found that perfMemory_xxx.cpp uses plain OS
>>>>>>         ::stat and misuses ctime as "creation time" - I opened a
>>>>>>         bug for that
>>>>>>         (https://bugs.openjdk.java.net/browse/JDK-8190260
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8190260> - but
>>>>>>         that does not affect us, as they do not call os::stat()).
>>>>>>
>>>>>>         There is one more complication: in os::stat() we use plain
>>>>>>         ::stat() in the single byte case.:
>>>>>>
>>>>>>         *+ if (strlen(path) < MAX_PATH) {*
>>>>>>         *+     ret = ::stat(pathbuf, sbuf);*
>>>>>>         *+   } else {*
>>>>>>         *
>>>>>>         *
>>>>>>         But ::stat() on Windows is broken, as I wrote above. We
>>>>>>         should not use it, especially not if we change the meaing
>>>>>>         of st_ctime in the double byte path.
>>>>>>
>>>>>>         My suggestion would be to just always call
>>>>>>         GetFileAttributes(), also for the single byte path. A very
>>>>>>         simple solution would be to just always go the double byte
>>>>>>         path with UNC translation, regardless of the path
>>>>>>         length. Or, if you are worried about performance, for paths
>>>>>>         shorter than MAX_PATH we use GetFileAttributesA(), for
>>>>>>         longer paths GetFileAttributesW with UNC translation. In
>>>>>>         both cases you get a WIN32_FILE_ATTRIBUTE_DATA which you
>>>>>>         can feed tp your  file_attribute_data_to_stat() routine and
>>>>>>         have the struct stat you return be always consistent.
>>>>>         I ran into an issue with the dwFileAttributes value for a
>>>>>         jar file. On Windows Server O/S, the value is set to 0x2020
>>>>>         which is (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>>>>         FILE_ATTRIBUTE_ARCHIVE) but on a desktop O/S like Windows 7,
>>>>>         it is set to 0x0020 which is just FILE_ATTRIBUTE_ARCHIVE.
>>>>>         I've fixed it in file_attribute_data_to_stat().
>>>>>
>>>>>
>>>>>     Oh.. :( good catch! But that opens a new can of worms I did not
>>>>>     see before:
>>>>>
>>>>>     FILE_ATTRIBUTE_NORMAL is documented as "A file that does not
>>>>>     have other attributes set. This attribute is valid only when
>>>>>     used alone." In addition to this flag, we have a multitude of
>>>>>     things like FILE_ATTRIBUTE_ARCHIVE, FILE_ATTRIBUTE_ENCRYPTED,
>>>>>     FILE_ATTRIBUTE_READONLY ... etc, all cases where we assume this
>>>>>     is a normal file in Unix terminology and where we would expect
>>>>>     os::stat to return S_IFREG, but where according to the msdn doc
>>>>>     FILE_ATTRIBUTE_NORMAL is not set.
>>>>>
>>>>>     Looking at what others do in this scenario (Looked at mingw
>>>>>     sources and at ReactOS - I am not posting any source code here
>>>>>     for legal reasons but feel free to look for yourself), the usual
>>>>>     way to translate WIN32_FILE_ATTRIBUTES to struct stat seems to
>>>>> be:
>>>>>     "if FILE_ATTRIBUTE_DIRECTORY then set S_IFDIR else S_IFREG" (so,
>>>>>     no dependency on FILE_ATTRIBUTE_NORMAL).
>>>>     This makes sense but I ran into similar problem as before - the
>>>>     dwFileAttributes has a different value on a windows server O/S vs
>>>>     desktop O/S. So I need to do the check as follows:
>>>>
>>>>     +   // A directory has the dwFileAttributes value of 0x2010
>>>> which is
>>>>     +   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>>> FILE_ATTRIBUTE_ARCHIVE)
>>>>     +   // on Windows Server O/S but the value is 0x0010 on Windows
>>>> desktop O/S
>>>>     +   // such as Windows 7.
>>>>     +   if ((file_data.dwFileAttributes& FILE_ATTRIBUTE_DIRECTORY)
>>>> != 0) {
>>>>     +     sbuf->st_mode |= S_IFDIR;
>>>>     +   } else {
>>>>     +     sbuf->st_mode |= S_IFREG;
>>>>     +   }
>>>>
>>>> I scratched my head a bit about the comment till I understood that
>>>> you mean "if FILE_ATTRIBUTE_DIRECTORY bit is set it is a directory
>>>> regardless of which other flags are set" instead of "if
>>>> flags==FILE_ATTRIBUTE_DIRECTORY it is a directory". Sure, I guess
>>>> my comment above was sloppy, but this was what I meant. I am not
>>>> even sure the comment is needed, this is quite self-explaining.
>>> I've noticed a typo in the above comment:
>>> +   // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
>>>
>>> FILE_ATTRIBUTE_ARCHIVE should be FILE_ATTRIBUTE_DIRECTORY
>>>
>>> I'll correct it before push.
>>>
>>>>
>>>>     updated webrev:
>>>>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.04/
>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.04/>
>>>>
>>>>
>>>> I am fine with all the changes now. Thank you for your work and
>>>> patience.
>>> Thanks for your discussions and review.
>>>
>>> thanks,
>>> Calvin
>>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>     Diff from webrev.03:
>>>>
>>>> < --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-08
>>>>     08:50:40.170786397 -0800
>>>> < +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-08
>>>>     08:50:39.803751877 -0800
>>>> < @@ -4060,41 +4060,119 @@
>>>>     ---
>>>> > --- old/src/hotspot/os/windows/os_windows.cpp 2017-11-01
>>>>     09:40:13.657460425 -0700
>>>> > +++ new/src/hotspot/os/windows/os_windows.cpp 2017-11-01
>>>>     09:40:13.261423192 -0700
>>>> > @@ -4060,41 +4060,121 @@
>>>>     25,29c25
>>>> < +  // A directory has the dwFileAttributes value of 0x2010 which is
>>>> < +  // (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE)
>>>> < +  // on Windows Server O/S but the value is 0x0010 on Windows
>>>>     desktop O/S
>>>> < +  // such as Windows 7.
>>>> < +  if ((file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
>>>>     != 0) {
>>>>     ---
>>>> > +  if (file_data.dwFileAttributes == FILE_ATTRIBUTE_DIRECTORY) {
>>>>     31c27,33
>>>> < +  } else {
>>>>     ---
>>>> > +  }
>>>> > +  if ((file_data.dwFileAttributes == FILE_ATTRIBUTE_NORMAL) ||
>>>> > +      // an archive file such as a jar file has the
>>>>     dwFileAttributes value of
>>>> > +      // 0x2020 (FILE_ATTRIBUTE_NOT_CONTENT_INDEXED |
>>>>     FILE_ATTRIBUTE_ARCHIVE)
>>>> > +      // on Windows Server O/S but the value is 0x0020 on
>>>> > +      // Windows desktop O/S such as Windows 7.
>>>> > +      ((file_data.dwFileAttributes & FILE_ATTRIBUTE_ARCHIVE)
>>>>     != 0)) {
>>>>
>>>>>
>>>>>     I wonder about other special cases which should work too:
>>>>>     - read-only files should be S_IFREG and !S_IWRITE,
>>>>     For a read-only system file under the user's home dir.
>>>>
>>>>     st_mode & 0xFF00 = 0x8100 = S_IFREG | S_IREAD
>>>>     dwFileAttributes = 39 = (FILE_ATTRIBUTE_ARCHIVE |
>>>>     FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM |
>>>>     FILE_ATTRIBUTE_READONLY)
>>>>>     read-only directories should be S_IFDIR and !S_IWRITE.
>>>>     I've tried the C:\progra~1 dir.
>>>>
>>>>     st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
>>>>     dwFileAttributes = 17 = (FILE_ATTRIBUTE_DIRECTORY |
>>>>     FILE_ATTRIBUTE_READONLY)
>>>>>     - We should tread the device root ("C:\") as a directory (so,
>>>>>     os::stat("c:\") should return S_IFDIR). Does this work?
>>>>     This one works too.
>>>>
>>>>     st_mode & 0xFF00 = 0x4100 = S_IFDIR | S_IREAD
>>>>     dwFileAttributes = 22 = (FILE_ATTRIBUTE_DIRECTORY |
>>>>     FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM)
>>>>>
>>>>>         I've also used GetTickCounts() to measure the performance of
>>>>>         ::stat() vs GetFileAttributesA() plus
>>>>>         file_attribute_data_to_stat(). There's no difference at the
>>>>>         ms resolution. Using the high-resolution timestamp
>>>>> (https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx)
>>>>>
>>>>> <https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408%28v=vs.85%29.aspx%29>,
>>>>>
>>>>>         it doesn't show much difference.
>>>>>
>>>>>
>>>>>     stat() seems to be implemented using FindFirstFile() (see crt
>>>>>     sources if you can), and we call GetFileAttributesA(), so I do
>>>>>     not think this differs much.
>>>>     Yes, I saw the same in my Visual Studio installation.
>>>>
>>>>     thanks,
>>>>     Calvin
>>>>
>>>>>>
>>>>>>         But onward:
>>>>>>
>>>>>>
>>>>>>
>>>>>>                 =========================
>>>>>>
>>>>>>                 src/hotspot/os/windows/os_windows.cpp
>>>>>>
>>>>>>                 Could you please use wchar_t instead of WCHAR?
>>>>>>
>>>>>>             Fixed.
>>>>>>
>>>>>>
>>>>>>                 ---------------
>>>>>>                 in os::stat():
>>>>>>
>>>>>>                 This cumbersome malloc/strcpy sequence:
>>>>>>
>>>>>>                 !   size_t path_len = strlen(path) + 1;
>>>>>>                 +   char* pathbuf = (char*)os::malloc(path_len *
>>>>>>                 sizeof(char), mtInternal);
>>>>>>                 +   if (pathbuf == NULL) {
>>>>>>                 +     errno = ENOMEM;
>>>>>>                       return -1;
>>>>>>                     }
>>>>>>                     os::native_path(strcpy(pathbuf, path));
>>>>>>
>>>>>>                  can be reduced to a simple strdup:
>>>>>>
>>>>>>                 char* pathbuf = os::strdup(path, mtInternal);
>>>>>>                 if (pathbuf == NULL) {
>>>>>>                   errno = ENOMEM;
>>>>>>                   return -1;
>>>>>>                 }
>>>>>>                 os::native_path(pathbuf);
>>>>>>
>>>>>>                 This also would separate strcpy() from
>>>>>>                 os::native_path() which is a bit unreadable.
>>>>>>
>>>>>>             I've made the change.
>>>>>>
>>>>>>
>>>>>>                 --------------------
>>>>>>
>>>>>>                 The single-byte path (strdup, ::stat()), together
>>>>>>                 with its free(), can be moved inside the
>>>>>>                 (strlen(path) < MAX_PATH) condition.
>>>>>>                 os::native_path will not change the path length (it
>>>>>>                 better not, as it operates on the input buffer).
>>>>>>                 That avoids  having two allocations on the
>>>>>>                 doublebyte path.
>>>>>>
>>>>>>
>>>>>>             os::native_path() converts a path like C:\\\\somedir to
>>>>>>             C:\\somedir
>>>>>>             So I'll need the converted path in the wchar_t case
>>>>>>             too. The freeing of the pathbuf is being done at the
>>>>>>             end of the function or in the middle of the wchar_t
>>>>>>             case if there's an error.
>>>>>>
>>>>>>
>>>>>>         Oh, you are right,of course. I missed that it this is
>>>>>>         needed for both paths.
>>>>>>
>>>>>>
>>>>>>
>>>>>>                 -----------------------
>>>>>>
>>>>>>                 But seeing that translation to UNC paths is
>>>>>>                 something we might want more often, I would combine
>>>>>>                 allocation and UNC prefix adding to one function
>>>>>>                 like this, to avoid the memove and increase
>>>>>>                 reusability:
>>>>>>
>>>>>>                 // Creates an unc path from a single byte path.
>>>>>>                 Return buffer is allocated in C heap and needs to
>>>>>>                 be freed by caller. Returns NULL on error.
>>>>>>                 static whchar_t* create_unc_path(const char* s) {
>>>>>>                   - does s start with \\?\ ?
>>>>>>                 - yes:
>>>>>>                             - os::malloc(strlen(s) + 1) and
>>>>>> mbstowcs_s
>>>>>>                         - no:
>>>>>>                             - os::malloc(strlen(s) + 1 + 4),
>>>>>>                 mbstowcs_s to fourth position in string, prefix
>>>>>>                 with L"\\?\"
>>>>>>                 }
>>>>>>
>>>>>>
>>>>>>             I also include the case for adding L"\\\\?\\UNC\0" at
>>>>>>             the beginning to be consistent with
>>>>>>             libjava/canonicalize_md.c.
>>>>>>
>>>>>>                 We also need error checking to mbstowcs_s.
>>>>>>
>>>>>>             I've added assert like the following after the call:
>>>>>>
>>>>>>             assert(converted_chars == path_len, "sanity");
>>>>>>
>>>>>>
>>>>>>
>>>>>>         create_unc_path() :
>>>>>>
>>>>>>         - could you convert the /**/ to // comments, please?
>>>>>         Fixed.
>>>>>
>>>>>
>>>>>     thanks.
>>>>>
>>>>>
>>>>>>
>>>>>>         - not sure about the assert after mbstowcs_s: if we happen
>>>>>>         to encounter an invalid multibyte character, function will
>>>>>>         fail and return an error:
>>>>>>
>>>>>> https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
>>>>>> <https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx>
>>>>>>         "If mbstowcs_s encounters an invalid multibyte character,
>>>>>>         it puts 0 in *``pReturnValue, sets the destination buffer
>>>>>>         to an empty string, sets errno to EILSEQ, and returns
>>>>>> EILSEQ."
>>>>>         I've changed create_unc_path() so that the caller will get
>>>>>         the errno and removed the assert.
>>>>>
>>>>>         static wchar_t* create_unc_path(const char* path, errno_t
>>>>> &err)
>>>>>
>>>>>
>>>>>     Okay, works for me.
>>>>>
>>>>>
>>>>>>
>>>>>>          As this is dependent on user data, we should not assert,
>>>>>>         but handle the return code of mbstowcs_s gracefully. Same
>>>>>>         goes for libjava/canonicalize_md.c.
>>>>>>
>>>>>>         - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len
>>>>>>         + 7, path, path_len);
>>>>>>         third parameter is wrong, as we hand in an offset into the
>>>>>>         buffer, we must decrement the buffer size by this offset,
>>>>>>         so correct would be path_len +7 - 7 or just path_len.
>>>>>>
>>>>>>         - Same error below: + ::mbstowcs_s(&converted_chars,
>>>>>> &wpath[4], path_len + 4, path, path_len);
>>>>>         Fixed in both places.
>>>>>
>>>>>
>>>>>     Okay.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>                 Just for cleanliness, I would then wrap
>>>>>>                 deallocation into an own function.
>>>>>>
>>>>>>                 static viud destroy_unc_path(whchar_t* s) {
>>>>>>                 os::free(s); }
>>>>>>
>>>>>>             I've added the destroy function.
>>>>>>
>>>>>>
>>>>>>                 These two functions could be candidates of putting
>>>>>>                 into os::win32 namespace, as this form of ANSI->UNC
>>>>>>                 path translation is quite common - whoever wants to
>>>>>>                 use the xxxxW() functions must do this.
>>>>>>
>>>>>>             I'm leaving them in os_windows.cpp since they're being
>>>>>>             used only within that file.
>>>>>>
>>>>>>
>>>>>>         Fine by me.
>>>>>>
>>>>>>
>>>>>>
>>>>>>                 -----------------------
>>>>>>
>>>>>>                 FindFirstFileW:
>>>>>>
>>>>>>                 I am pretty sure that you can achieve the same
>>>>>>                 result with GetFileAttributesExW(). It also returns
>>>>>>                 WIN32_FIND_DATAW.
>>>>>>
>>>>>>             It actually returns WIN32_FILE_ATTRIBUTE_DATA and is
>>>>>>             very similar to WIN32_FIND_DATAW.
>>>>>>
>>>>>>
>>>>>>                 It is way more straightforward to use than
>>>>>>                 FindFirstFileW, as it does not require you to write
>>>>>>                 a callback hook.
>>>>>>
>>>>>>             I've switched to using GetFileAttributesExW().
>>>>>>
>>>>>>
>>>>>>         Thank you, this is way more readable.
>>>>>>         Another issue: do we still need the fix for 6539723 if we
>>>>>>         switch from stat() to GetFileAttributesExW()? This fix
>>>>>>         looks important, but the comment
>>>>>>         indicates that it could break things if the original bug is
>>>>>>         not present.
>>>>>>
>>>>>>         Btw, this is another strong argument for scrapping ::stat()
>>>>>>         altogether on all code paths, not only for long input
>>>>>>         paths, because stat() and GetFileAttributesExW() may behave
>>>>>>         differently. So it would be good to use the same API on all
>>>>>>         code paths, in order to get the best test coverage.
>>>>>         For this round of change, I'm using
>>>>>         GetFileAttributesEx[A|W]() for both code paths.
>>>>>
>>>>>         webrev:
>>>>> http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/
>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.03/>
>>>>>
>>>>>         thanks,
>>>>>         Calvin
>>>>>
>>>>>
>>>>>     Okay, all good apart from the issues mentioned above. Thanks for
>>>>>     your work!
>>>>>
>>>>>     Best Regards, Thomas
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>                 -------
>>>>>>
>>>>>>                 eval_find_data(): This is more of a generic helper
>>>>>>                 function, could you rename this to something
>>>>>>                 clearer, e.g. make_double_word() ?
>>>>>>
>>>>>>             Ok. I've renamed it.
>>>>>>
>>>>>>
>>>>>>                 Also, setup_stat(), could this be renamed more
>>>>>>                 clearly to something like WIN32_FIND_DATA_to_stat?
>>>>>>                 or lowercase if this bothers you :)
>>>>>>
>>>>>>             I'm naming the function as
>>>>>>             file_attribute_data_to_stat() to match with the data
>>>>>>             structure name.
>>>>>>
>>>>>>
>>>>>>         Thanks for taking my suggestions.
>>>>>>
>>>>>>
>>>>>>
>>>>>>                 ==================================
>>>>>> src/hotspot/share/classfile/classLoader.cpp
>>>>>>
>>>>>>                 In ClassPathDirEntry::open_stream(), I would feel
>>>>>>                 better if we were asserting _dir and name to be !=
>>>>>>                 NULL before feeding it to strlen.
>>>>>>
>>>>>>             I've added an assert statement.
>>>>>>
>>>>>>
>>>>>>                 ===================
>>>>>>
>>>>>>             Here's an updated webrev:
>>>>>> http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.02/>
>>>>>>
>>>>>>
>>>>>>         Thanks!
>>>>>>
>>>>>>         Thomas
>>>>>>
>>>>>>             thanks,
>>>>>>             Calvin
>>>>>>
>>>>>>
>>>>>>                 Thanks!
>>>>>>
>>>>>>                 Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>                 On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung
>>>>>> <[hidden email]
>>>>>> <mailto:[hidden email]>
>>>>>> <mailto:[hidden email]
>>>>>> <mailto:[hidden email]>>> wrote:
>>>>>>
>>>>>>                     I've reworked this fix by using the Unicode
>>>>>>                 version of open
>>>>>>                     (wopen) to handle path name longer than max
>>>>>>                 path with the path
>>>>>>                     prefixed to indicate an extended length path as
>>>>>>                 described in [1].
>>>>>>
>>>>>>                     The Unicode version of stat (wstat) doesn't
>>>>>>                 work well with long
>>>>>>                     path [2]. So FindFirstFileW will be used.The
>>>>>>                 data in
>>>>>>                     WIN32_FIND_DATA returned from FindFirstFileW
>>>>>>                 needs to be
>>>>>>                     transferred to the stat structure since the
>>>>>>                 caller expects a
>>>>>>                     return stat structure and other platforms
>>>>>>                 return a stat structure.
>>>>>>
>>>>>>                     In classLoader.cpp, calculate the size of
>>>>>>                 buffer required instead
>>>>>>                     of limiting it to JVM_MAXPATHLEN.
>>>>>>                     In os_windows.cpp, dynamically allocate buffers
>>>>>>                 in os::open and
>>>>>>                     os::stat.
>>>>>>
>>>>>>                     updated webrev:
>>>>>> http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/
>>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>>
>>>>>>
>>>>>>                     It passed hs-tier2 testing using mach5.
>>>>>>
>>>>>>                     thanks,
>>>>>>                     Calvin
>>>>>>
>>>>>>                     [1]
>>>>>> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH 
>>>>>>
>>>>>> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>
>>>>>>
>>>>>> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH 
>>>>>>
>>>>>> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>>
>>>>>>
>>>>>>
>>>>>>                     [2]
>>>>>> https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral 
>>>>>>
>>>>>> <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>>>>>>
>>>>>> <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral 
>>>>>>
>>>>>> <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>                     On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>>>>>>
>>>>>>                         JBS:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8188122
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8188122>
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8188122
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8188122>>
>>>>>>
>>>>>>                         Adding a warning message if the full path
>>>>>>                 or the directory
>>>>>>                         length exceeds MAX_PATH on windows.
>>>>>>
>>>>>>                         Example warning messages.
>>>>>>
>>>>>>                         1) The full path exceeds MAX_PATH:
>>>>>>
>>>>>>                         Java HotSpot(TM) 64-Bit Server VM warning:
>>>>>>                 construct full path
>>>>>>                         name failed: total length 270 exceeds max
>>>>>>                 length of 260
>>>>>>                             dir
>>>>>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>>>>>>
>>>>>>                         length 259
>>>>>>                             name Hello.class length 11
>>>>>>
>>>>>>                         2) The directory path exceeds MAX_PATH:
>>>>>>
>>>>>>                         Java HotSpot(TM) 64-Bit Server VM warning:
>>>>>>                 os::stat failed:
>>>>>>                         path length 265 exceeds max length 260
>>>>>>                             path
>>>>>> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>>>>>>
>>>>>>
>>>>>>                         webrev:
>>>>>> http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/
>>>>>> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>>
>>>>>>
>>>>>>                         Testing:
>>>>>>                             JPRT (including the new test)
>>>>>>
>>>>>>                         thanks,
>>>>>>                         Calvin
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>