Skip to content

Integration of LB Thesis#63

Closed
pmbittner wants to merge 83 commits into
commit-untanglingfrom
thesis_lb
Closed

Integration of LB Thesis#63
pmbittner wants to merge 83 commits into
commit-untanglingfrom
thesis_lb

Conversation

@pmbittner
Copy link
Copy Markdown
Member

No description provided.

TheBormann added 30 commits July 6, 2022 14:58
node duplication has errors => fromID() seems to have errors
Merged commits from the main repo
fix: remove brackets around single words in boolean abstraction
ibbem added 3 commits February 6, 2023 10:39
These functions might be useful for other use cases as well and fit into
these classes.
@ibbem
Copy link
Copy Markdown
Collaborator

ibbem commented Feb 9, 2023

As requested, I merged develop into this branch.
I continued to work on the analysis/validation refactoring, so sorry for the big diff :).
Of course, there is still cleanup potential but now I am quite happy about the general structure of the analysis/validation.

@ibbem
Copy link
Copy Markdown
Collaborator

ibbem commented Feb 9, 2023

@pmbittner ping for review

There are some behavioral changes:

- The
  > temporary fix for renaming from Unchanged to Untouched
  has been removed.

- There are two more metadata keys: `exportedCommits` and
  `exportedTrees`. These where previously called `processedCommits` and
  `processedTrees` and used with different meanings in the
  `DiffTreeMiner` and the validations which caused a bug increasing this
  these counters twice, although that was probably my mistake, introduced
  during refactoring or merging.

- The order of metadata snapshots has probably changed
Copy link
Copy Markdown
Member Author

@pmbittner pmbittner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ibbem ,

thanks for the heavy refactoring of the analyses framework. It looks much better now. It was not easy to review it given the extensive amount of changes and the still complicated analyses we perform.

Two major things I noticed are:
1.) I did not review the FeatureSplit parts yet. I have to dig into those in the next weeks. Can you move all the new FeatureSplit related classes to dedicated sub-packages so that we can easily identify them after the merge? What is your opinion on the location of the new code?
2.) Please add some documentation (javadoc) to the new analyses framework. It is still complicated and guidance for developers on how to properly use it is missing.

After that we can merge. Thank you. :)

Comment thread src/main/java/org/variantsync/diffdetective/analysis/Analysis.java
Comment thread src/main/java/org/variantsync/diffdetective/analysis/Analysis.java
Comment thread src/main/java/org/variantsync/diffdetective/analysis/Analysis.java
while (hook.hasNext()) {
if (!callHook.apply(hook.next(), this)) {
return false;
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FilterHooks should be side-effect free but they aren't. For example, LineGraphExportAnalysis::onParsedCommit has a side-effect. When exitiving here as soon as some filter says "no", we would omit running the remaining side-effects = bug! So I would argue to just run all hooks and make a big AND of their return values.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Filter hooks are not required to be side-effect free. I will explain this in the upcoming documentation.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in our last meeting, the current behaviour is fine but should be documented.

import java.util.stream.Collectors;

/** Accumulates multiple {@link AnalysisResult}s of several datasets. */
public class MiningResultAccumulator {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to AnalysesResultAccumulator?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a better name for the current implementation as it doesn't accumulate all results but only ExplainedFilterSummary, EditClassCount and StatisticsAnalysis which are created by the DiffTreeMiner.
We could generalize MiningResultAccumulator by either

  1. creating a registry for all Metadata so it can be created in AnalysesResultAccumulator, or
  2. adding the qualified class names to the serialization and creating the relevant classes at runtime.

Then AnalysesResultAccumulator would be appropriate.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss that in a meeting. Putting effort in here only makes sense if we actually profit from such a refactoring.

@ibbem
Copy link
Copy Markdown
Collaborator

ibbem commented Feb 20, 2023

  1. I don't think it's a good idea to move all feature split related operations into a sub package. To me, it does make more sense to have a package with all transformation we can perform on DiffTrees as we have now.
  2. I am currently writing documentation. Stay tuned :)

During discussion you noted that thesis_lb should be essentially freezed and be reset to before the origin/develop merge and all these refactorings. As we loose all the above discussion I will postpone this action until the discussion is finished.

@ibbem ibbem mentioned this pull request Feb 20, 2023
@ibbem
Copy link
Copy Markdown
Collaborator

ibbem commented Feb 20, 2023

I integrated the changes into benjamin/analysis (#65) and thesis_lb-refactorings. For comparison I uploaded the state before these changes but after the rebase onto develop to benjamin/old/analysis-refactoring (Note: I made some cosmetic changes during the rebase to make the single commits more readable).

@pmbittner
Copy link
Copy Markdown
Member Author

#65 has been merged. Closing this PR now.

@pmbittner pmbittner closed this Feb 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bm_work ibbem is paid for working on this enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants