Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: correctly detect if file is outside base path on Windows #59

Open
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

fasttime
Copy link
Member

@fasttime fasttime commented Jun 17, 2024

Prerequisites checklist

What is the purpose of this pull request?

Fix logic that detects when a file is outside the base path so it works well with Windows-style paths. Also fixed an edge case with paths that start with ".." but are not external like "..file.js".

What changes did you make? (Give an overview)

Windows paths have some unique features not shared by Unix paths. For one, absolute local paths start with a drive letter followed by a colon like "C:". The fact that the drive letter must be the first character in a local path means that a relative path cannot specify a drive. In Node.js, path.relative() will just return the absolute path of the second argument when it cannot calculate a relative path across different drives, so for example with path.relative("C:\\dir", "D:\\file.js").

Besides local paths, Windows uses UNC paths in the form \\Server\Volume\dir1\dir2\file to specify files on a network. Those paths are always absolute.

Both config-array and ESLint use relative paths to check if a file is inside a base directory: if the file's relative path from the directory starts with "..", then the file is not in the directory. This check is currently failing when the base directory and the file were located on different drives or servers, because as previously mentioned path.relative() doesn't handle those cases.

The fix I've come up with consists in calculating what Node.js calls namespace-prefixed paths of the base directory and the file and calling path.relative() on those two. Namespaced paths are prefixed with \\?\ for local paths or with \\?\UNC for network paths. When the provided arguments are namespaced paths on different drives, path.relative() will return a path in the form ..\..\C:\dir1\dir2\file.js. While such a path is not valid as it contains a drive letter in middle position, it can be readily recognized as for being outside the base path because of the leading .. and discarded as external.

Related Issues

Is there anything you'd like reviewers to focus on?

@eslint-github-bot eslint-github-bot bot added the bug Something isn't working label Jun 17, 2024
@fasttime fasttime changed the title fix: correctly detect is file is outside base path on Windows fix: correctly detect if file is outside base path on Windows Jun 17, 2024
eslint.config.js Outdated Show resolved Hide resolved
Comment on lines -355 to -358
if (relativeFilePath.startsWith("..")) {
return true;
}
Copy link
Member Author

@fasttime fasttime Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is already performed in getConfigWithStatus().


//-----------------------------------------------------------------------------
// Helpers
//-----------------------------------------------------------------------------

// calculate base path using import.meta
const basePath = path.dirname(new URL(import.meta.url).pathname);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows, this was giving a path like /c:/foo/bar/rewrite/packages/config-array/tests, where the c: is considered a top-level directory rather than a drive specifier. This was producing incorrect results with path.resolve(basePath) like C:\c:\foo\bar\rewrite\packages\config-array\tests.

@@ -2860,6 +3010,30 @@ describe("ConfigArray", () => {
);
});

it("should return true when the directory is the parent of the base path", () => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to ensure that a relative path like ".." without a slash is recognized as external.

packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
@fasttime fasttime marked this pull request as ready for review June 23, 2024 20:18
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/to-namespaced-path.js Outdated Show resolved Hide resolved
}

// On non-Windows systems, `toNamespacedPath` returns the argument as is.
export default sep === "\\" ? win32ToNamespacedPath : arg => arg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem like a safe check. Again, because we aren't going to know what format a string is in when it's passed in a browser environment, I don't think we can assume path.sep is accurate for the data we'll be getting.

It seems safer to check for the first instance of \ and / in the string? Something like:

const isWinPath = Boolean(filePath.indexOf("/") - file.Path.indexOf("\\"));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input file path could be relative, so there may not be enough indication to determine if it's a Unix or a Windows path. Another complication is that path.relative and path.dirname have different implementations for Unix and Windows (the path separator / vs. \ is not the only difference). These functions will return different results depending on the platform where Node.js runs even for the same input.

If ConfigArray needs to work with both Windows and Unix paths on any platform including browsers, and not depend on cwd at all, it would be better to avoid using node:path functions to manipulate paths and implement the whole logic internally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ConfigArray needs to work with both Windows and Unix paths on any platform including browsers, and not depend on cwd at all, it would be better to avoid using node:path functions to manipulate paths and implement the whole logic internally.

This was actually the intent. Originally, the only reference to path I had was path.relative, which I intended to replace with an inline function in order to avoid that dependency. My goal going forward was to make sure we weren't starting to use even more of path to ensure everything would work in browsers correctly.

Whatever we do, I just don't want to add even deeper dependency on the path module. If we need to rethink things a bit, then let's do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started working on generic path handling helpers to handle both Unix and Windows style paths regardless of the current platform. I'm planning to show a prototype in this PR to get some feedback.

@nzakas
Copy link
Member

nzakas commented Aug 8, 2024

We may want to consider using the Deno standard path library, which is not runtime or OS specific:
https://jsr.io/@std/path

@fasttime
Copy link
Member Author

We may want to consider using the Deno standard path library, which is not runtime or OS specific: https://jsr.io/@std/path

It looks like that library contains different implementations for windows and posix, and by default, it selects the implementation for the current platform. That is similar to how Node.js node:path works.

It would be useful to find a library that can be installed via npm so we don't need to copy the source files manually.

@nzakas
Copy link
Member

nzakas commented Aug 13, 2024

npx jsr add @std/path works. 😄

@fasttime
Copy link
Member Author

Finally made it work as expected using only the Deno standard library for path handling and without considering the current directory. This fixes the way external files are matched on Windows by including drive letters or UNC server names when resolving relative paths.

An important note: external files are files located outside the base path, but the base path is an optional parameter. If no base path is specified, files will never be considered external. This changes the current behavior where the base path defaults to be the current directory if not specified (in ESLint, the base path is always specified).

@fasttime fasttime requested a review from nzakas August 19, 2024 08:14
@nzakas
Copy link
Member

nzakas commented Aug 20, 2024

An important note: external files are files located outside the base path, but the base path is an optional parameter. If no base path is specified, files will never be considered external. This changes the current behavior where the base path defaults to be the current directory if not specified (in ESLint, the base path is always specified).

This is actually the way I thought it would work when I first implemented it. 😄 I don't think this will affect users because we always pass basePath, but we may want to mark this as breaking for this package.

nzakas
nzakas previously approved these changes Aug 20, 2024
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As I mentioned, I think we should mark this as breaking.

Would like @mdjermanovic to review before merging.

@mdjermanovic mdjermanovic changed the title fix: correctly detect if file is outside base path on Windows fix!: correctly detect if file is outside base path on Windows Aug 25, 2024
@@ -1 +1,2 @@
package-lock = false
@jsr:registry=https://npm.jsr.io
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would end users need to add this to their .npmrc files?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried using npm pack-ed version of config-array from this branch in eslint/eslint, and npm install fails with:

$ npm i
npm ERR! code E404
npm ERR! 404 Not Found - GET https://registry.npmjs.org/@jsr%2fstd__path - Not found
npm ERR! 404
npm ERR! 404  '@jsr/std__path@^1.0.2' is not in this registry.
npm ERR! 404
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can expect users to change their npm settings so that they can download packages from JSR. We could distribute @std/path in a separate directory inside config-array and import it from there. Or we could use the URL of the tarball from JSR to specify the dependecy in package.json:

-    "@jsr/std__path": "^1.0.2",
+    "@jsr/std__path": "https://npm.jsr.io/~/11/@jsr/std__path/1.0.2.tgz",

Either way, we will lose the magic of the caret ^ to pick newer versions automatically. I checked the JSR docs at https://jsr.io/docs/using-packages but I didn't find any recommendations on how to use their packages with npm apart from registering JSR in .npmrc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can one of you open an issue on JSR for this? I can't imagine they didn't encounter this before, but before we spin our wheels, it would be nice to verify if we're doing things the right way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a discussion here: jsr-io/jsr#701

@nzakas
Copy link
Member

nzakas commented Sep 3, 2024

I'd say let's just use the URL for now. We can always update the approach if/when they respond to your question.

@fasttime
Copy link
Member Author

fasttime commented Sep 4, 2024

The build is now failing because the generated CommonJS module is trying to require() @jsr/std__path, which is an ESM-only package. In packages/config-array/dist/cjs/index.cjs I can see:

var posixPath = require('@jsr/std__path/posix');
var windowsPath = require('@jsr/std__path/windows');

I'm not sure why there was no build error in my previous commit. The error doesn't seem connected to the way @jsr/std__path is declared in package.json. Maybe we need to bundle @jsr/std__path in the CommonJS module so it gets transpiled.

@mdjermanovic
Copy link
Member

Maybe we need to bundle @jsr/std__path in the CommonJS module so it gets transpiled.

If we're going to bundle @jsr/std__path in CJS, then we could bundle it in ESM as well, to avoid having a URL depedency (which I think is not a common practice for packages published to npm).

@fasttime
Copy link
Member Author

fasttime commented Sep 8, 2024

Maybe we need to bundle @jsr/std__path in the CommonJS module so it gets transpiled.

If we're going to bundle @jsr/std__path in CJS, then we could bundle it in ESM as well, to avoid having a URL depedency (which I think is not a common practice for packages published to npm).

Thanks! I tried to bundle @jsr/std__path in dist/cjs/index.cjs and dist/esm/index.js, but then tsc started to complain because of the checkJs option in tsconfig.json:

dist/esm/index.js:316:46 - error TS1313: The body of an 'if' statement cannot be the empty statement.

316       if (lastSlash === i - 1 || dots === 1) ; else if (lastSlash !== i - 1 && dots === 2) {
                                                 ~

dist/esm/index.js:516:15 - error TS2339: Property 'Deno' does not exist on type 'typeof globalThis'.

516       const { Deno } = globalThis;
                  ~~~~

dist/esm/index.js:1148:5 - error TS2353: Object literal may only specify known properties, and 'extended' does not exist in type '{ globstar?: boolean; }'.

1148     extended,
         ~~~~~~~~

dist/esm/index.js:1590:46 - error TS1313: The body of an 'if' statement cannot be the empty statement.

1590       if (lastSlash === i - 1 || dots === 1) ; else if (lastSlash !== i - 1 && dots === 2) {
                                                  ~

dist/esm/index.js:1970:13 - error TS2339: Property 'Deno' does not exist on type 'typeof globalThis'.

1970     const { Deno } = globalThis;
                 ~~~~

dist/esm/index.js:2742:5 - error TS2353: Object literal may only specify known properties, and 'extended' does not exist in type '{ globstar?: boolean; }'.

2742     extended,
         ~~~~~~~~


Found 6 errors in the same file, starting at: dist/esm/index.js:316

So I decided to bundle @jsr/std__path/posix and @jsr/std__path/windows in separate modules that tsc would not check. This seems to be working well and results in the following files being generated:

dist/esm/types.d.ts
dist/esm/index.js
dist/esm/types.ts
dist/esm/std__path/posix.js
dist/esm/std__path/windows.js
dist/esm/index.d.ts
dist/cjs/index.d.cts
dist/cjs/types.ts
dist/cjs/index.cjs
dist/cjs/std__path/posix.cjs
dist/cjs/std__path/windows.cjs

I will update this PR shortly with the new solution, and hopefully the CI build will pass this time.

@mdjermanovic
Copy link
Member

Shall I update the PR to make/ the default basePath ?

Makes sense to me 👍

@fasttime
Copy link
Member Author

Thanks for the feedback. I've made basePath default to "/" in commit 193ece2 and cleaned up the code used to support a missing base path. Only test changes are required in ESLint in order to use the updated config-array: https://github.com/eslint/eslint/compare/update-for-rewrite-pr-59.

nzakas
nzakas previously approved these changes Oct 18, 2024
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Just left a couple questions for clarification but neither are blockers.

packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
Comment on lines 548 to 560
/** The namespaced path of the config file directory. */
#namespacedBasePath;

/** Path-handling implementations */
#path;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add types to these properties?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in ab61572.

@@ -598,7 +599,7 @@ export class ConfigArray extends Array {
* @property basePath
* @type {string}
*/
this.basePath = basePath;
this.basePath = basePath || "/";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check -- should this be nullish or do we also want to check for empty string as an invalid value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think It's a good idea to treat an empty string as an invalid value, but RuleTester is currently setting basePath to an empty string when given a relative file path (e.g. "somefile.js").

So if we are going to treat empty strings as invalid we'll need to make a small change in ESLint to ensure that basePath is undefined (or "/") in cases where it is currently set to an empty string in rule-tester.js, i.e.:

            if (item.filename) {
-                flatConfigArrayOptions.basePath = path.parse(item.filename).root;
+                flatConfigArrayOptions.basePath = path.parse(item.filename).root || void 0;
            }

in https://github.com/eslint/eslint/blob/v9.13.0/lib/rule-tester/rule-tester.js#L655-L657.

Shall we do this or is it better to treat an empty basePath as valid and keep the logic in RuleTester unchanged?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think it's a good idea to treat an empty string as an invalid value.

That code was added in RuleTester to support absolute paths. I think it's fine to update it as suggested.

Users that pass cwd to Linter won't notice the difference as it already treats an empty string the same as undefined (https://github.com/eslint/eslint/blob/1a7cdbfad8d3355132aab3201c7dd3f2e879b19d/lib/linter/linter.js#L1201-L1212).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted breaking bug Something isn't working
Projects
Status: Implementing
Development

Successfully merging this pull request may close these issues.

3 participants