Compare Multiple Image Pairs (Like Millions) #33

Open
RoberAlcaraz opened this issue Oct 1, 2024 · 4 comments
Comments

@RoberAlcaraz

Hi!
First of all, the work is amazing.
I am currently working on a project that involves comparing millions of image pairs (around 10M). Even though the demo is very useful, running it pair by pair takes far too long for this many pairs.
Is it possible to compare multiple pairs at once, for example through batch processing or parallelization? If you have any recommendations on how to approach this, I would greatly appreciate your input.
Thank you in advance :)
Rob

@iago-suarez
Collaborator

iago-suarez commented Oct 2, 2024 via email

@RoberAlcaraz
Author

Hi Iago,

Thank you for your quick reply!

Indeed, I am trying to compare multiple images to detect whether they show the same individual, based on the points and lines of the pattern present in each image. This is why I have so many pairs to compare. Specifically, I have between 4,000 and 5,000 images, which results in a total of $\frac{n(n-1)}{2} \approx 10{,}000{,}000$ image pairs to evaluate.
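(For instance, with $n = 4{,}500$ images that is $\frac{4500 \cdot 4499}{2} \approx 10.1$ million pairs.)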

I appreciate your suggestion of pre-computing the wireframe for each image individually. Could you possibly provide an example or some guidance on how I could implement this pre-computation, or on how to use batching effectively to speed up the comparison process?

Any detailed example or reference would be immensely helpful.

Thank you again for your assistance! :)

Best,
Roberto

@iago-suarez
Collaborator

iago-suarez commented Oct 3, 2024 via email

@RoberAlcaraz
Author

Hi Iago,

I checked the model files and made some modifications to achieve the functionality I wanted:

Modifications

  • wireframe.py: In the _forward method, I introduced the h5py library to enable saving/loading of precomputed results (keypoints, scores, descriptors) to/from an HDF5 file. I added the save_path and image_id arguments to check if precomputed data exists for an image, preventing redundant computation.
import h5py  # Added for HDF5 support

def _forward(self, data, save_path=None, image_id=None):  # Added save_path and image_id
    # Check if precomputed data is available
    if save_path and image_id:  # New block to load precomputed data
        with h5py.File(save_path, "a") as hdf5_file:
            if image_id in hdf5_file:
                grp = hdf5_file[image_id]
                if "lines" in grp and "line_scores" in grp:
                    # Read each dataset back into memory ([()]) and move it to the
                    # same device as the input image
                    device = data["image"].device
                    return {
                        "image": data["image"],
                        "keypoints": torch.from_numpy(grp["keypoints"][()]).to(device),
                        "keypoint_scores": torch.from_numpy(grp["keypoint_scores"][()]).to(device),
                        "descriptors": torch.from_numpy(grp["descriptors"][()]).to(device),
                        "lines": torch.from_numpy(grp["lines"][()]).to(device),
                        "line_scores": torch.from_numpy(grp["line_scores"][()]).to(device),
                        "pl_associativity": torch.from_numpy(grp["pl_associativity"][()]).to(device),
                        "lines_junc_idx": torch.from_numpy(grp["lines_junc_idx"][()]).to(device),
                    }

    # Original processing code here...

    # Save the computed lines and wireframe if `save_path` and `image_id` are provided
    if save_path and image_id:  # New block to save computed data to HDF5
        with h5py.File(save_path, "a") as hdf5_file:
            grp = hdf5_file.require_group(image_id)
            grp.create_dataset("image", data=data["image"].cpu().detach().numpy(), compression="gzip")
            grp.create_dataset("keypoints", data=all_points.cpu().detach().numpy(), compression="gzip")
            grp.create_dataset("keypoint_scores", data=all_scores.cpu().detach().numpy(), compression="gzip")
            grp.create_dataset("descriptors", data=all_descs.cpu().detach().numpy(), compression="gzip")
            grp.create_dataset("lines", data=lines.cpu().detach().numpy(), compression="gzip")
            grp.create_dataset("line_scores", data=line_scores.cpu().detach().numpy(), compression="gzip")
            grp.create_dataset("pl_associativity", data=pl_associativity.cpu().detach().numpy(), compression="gzip")
            grp.create_dataset("lines_junc_idx", data=lines_junc_idx.cpu().detach().numpy(), compression="gzip")
  • two_view_pipeline.py: I updated the _forward method so that it receives the precomputed features through the pred argument and only runs the matcher, filter, and solver components, instead of re-running the feature extractor for every pair.
import h5py  # Added for HDF5 support

def _forward(self, data, pred):  # pred already carries the precomputed features, so the extractor is skipped
    # Run the matcher if it exists in the configuration
    if self.conf.matcher.name:
        pred = {**pred, **self.matcher({**data, **pred})}

    # Run filter and solver if they are part of the pipeline configuration
    if self.conf.filter.name:
        pred = {**pred, **self.filter({**data, **pred})}

    if self.conf.solver.name:
        pred = {**pred, **self.solver({**data, **pred})}

    return pred

Usage

  1. To compute wireframes: This example demonstrates how to compute and save wireframes, skipping images that have already been processed:
for img_path in img_paths:
    image_id = f"{os.path.basename(os.path.dirname(img_path))}/{os.path.basename(img_path)}"
    # Open in append mode ("a") so results from previous runs are kept, and close the
    # handle again before _forward re-opens the file to write the new group
    with h5py.File(wireframe_results_path, "a") as hdf5_file:
        if image_id in hdf5_file:
            print(f"Skipping {image_id}, already processed.")
            continue
    data = {"image": numpy_image_to_torch(cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)).to(device)[None]}
    wireframe_result = wireframe._forward(data, save_path=wireframe_results_path, image_id=image_id)
  2. To compute point and line matches between pairs: This example shows how the precomputed features are passed to the TwoViewPipeline for matching (loading them from the HDF5 file into the precomputed_features dictionary is sketched after this example):
def compute_point_and_line_matches(pipeline, precomputed_features, results, img_id0, img_id1):
    # The pipeline only needs the two images plus their precomputed features
    data = {
        "image0": precomputed_features[img_id0]["image"],
        "image1": precomputed_features[img_id1]["image"],
    }
    # Copy the cached features and rename their keys with the "0"/"1" suffixes
    # expected by the two-view pipeline
    pred0, pred1 = precomputed_features[img_id0].copy(), precomputed_features[img_id1].copy()
    del pred0["image"], pred1["image"]
    pred = {**{k + "0": v for k, v in pred0.items()}, **{k + "1": v for k, v in pred1.items()}}
    # Run only the matcher/filter/solver stages on the precomputed features
    match_result = pipeline._forward(data, pred)
    results.append((img_id0, img_id1, match_result["match_scores0"], match_result["line_match_scores0"]))
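
For reference, here is a rough sketch of how I fill the precomputed_features dictionary from the HDF5 file and then loop over all pairs. The load_precomputed_features helper and the image_ids list are just illustrative names (image_ids being the same IDs used when saving), and pipeline is the TwoViewPipeline instance:

import itertools

import h5py
import torch

def load_precomputed_features(save_path, image_ids, device="cpu"):
    # Illustrative helper: read back the datasets written by the modified
    # wireframe._forward and convert them to torch tensors.
    # Loading everything into memory is fine for a few thousand images;
    # for much larger sets the features should be loaded lazily instead.
    features = {}
    with h5py.File(save_path, "r") as hdf5_file:
        for image_id in image_ids:
            grp = hdf5_file[image_id]
            features[image_id] = {
                key: torch.from_numpy(grp[key][()]).to(device)
                for key in ("image", "keypoints", "keypoint_scores", "descriptors",
                            "lines", "line_scores", "pl_associativity", "lines_junc_idx")
            }
    return features

precomputed_features = load_precomputed_features(wireframe_results_path, image_ids, device)

# n(n-1)/2 unordered pairs; with millions of pairs this is the loop worth
# parallelizing, e.g. by splitting the pair list across processes or GPUs
results = []
for img_id0, img_id1 in itertools.combinations(image_ids, 2):
    compute_point_and_line_matches(pipeline, precomputed_features, results, img_id0, img_id1)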

Thank you once again!
Roberto
