Pointpriors #663

bgctw · 2024-09-17T07:39:45Z

Tackles #662: Querying the log-density of components of the prior.

The implementation does not decompose the log-density of dot-tilde expressions, because a possible solution (in the first commit, but removed again in the 3rd commit) would need to decompose dot_assume, which is not under context control. However, I do need to pass computation to child contexts, because I want to inspect the log-density transformations applied by child contexts. Therefore, I called it varwise_logpriors rather than pointwise_logpriors.

In addition, I decided to handle a Chains of samples differently from pointwise_loglikelihoods, because I did not fully comprehend its different push!! methods and the different initializers for the collecting OrderedDict, nor which applies under which conditions. Instead, I tried to separate the concerns of querying densities for a single sample and applying that to a Chains object. I hope that the mutation of a pre-allocated array is OK here.

torfjelde

Hey @bgctw !

Can you clarify why you want the .~ statements to be treated as a single log-prob in your case? You mention that your motivation is tempering; it's a bit unclear to me why varwise_logpriors are needed for this. And why is the Chain needed in this case? When I think of tempering in our context, I'm imagining altering the likelihood / prior weightings during sampling, not as a post-inference step.

Maybe even writing a short bit of pseudo-code outlining what you want to do with this could help!

From your initial motivation in #662, I feel like we can probably find alternative approaches that might be a bit simpler:)

bgctw · 2024-09-17T10:18:19Z

My goal is to modify the log-density during sampling. I imagine putting something similar to TestLogModifyingChildContext in src/test_utils.jl between the SamplingContext and the DefaultContext. For example, I want to relax the parameter priors of an ODE model during burnin or initial optimization of a Turing model, but keep the original priors on additive effects that modify parameters for simulated replicates around a population mean. However, before tackling this, I want to be able to query/see the corresponding log-densities that are used during the sampling or optimization.

Hence, I want to query the log-densities of the prior components as seen by a sampler that generated the samples in an AbstractMCMC.AbstractChains object.

The single number provided by logprior(m, vi) for a single sample is too coarse, because I want to experiment with components.

The pointwise resolution, i.e. also resolving the components of the log-density of dot_tilde_assume, such as (s[1], s[2], ..., s[n]) of s .~ Normal(...), would be nice, but it is more complex to implement together with the requirement that those densities can be modified by a child context. Reporting only their cumulative log-density is an acceptable tradeoff. If this is prohibitive, the user could reformulate the .~ as an explicit loop, because then those components are resolved with the currently suggested implementation.
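To make the tradeoff concrete, here is a minimal sketch in plain Julia, without DynamicPPL. The values of `s`, the hand-written `normal_logpdf`, and the string keys are assumptions for illustration only; they are not what the PR records internally. It shows that a single summed record and the per-component records of an explicit loop carry the same total.

```julia
# Log-density of Normal(mu, sigma) at x, written out so the sketch needs no packages.
normal_logpdf(x, mu, sigma) = -0.5 * log(2pi) - log(sigma) - 0.5 * ((x - mu) / sigma)^2

s = [0.3, -1.2, 0.8]   # hypothetical values of the components of `s`

# What a single record for `s .~ Normal(0, 1)` would hold: the summed log-density.
logp_sum = sum(normal_logpdf.(s, 0.0, 1.0))

# What an explicit loop `s[i] ~ Normal(0, 1)` exposes: one entry per component.
logp_components = Dict("s[$i]" => normal_logpdf(s[i], 0.0, 1.0) for i in eachindex(s))

# Both views carry the same total log-density.
@assert isapprox(sum(values(logp_components)), logp_sum)
```

The summed form loses nothing for the overall prior, but only the looped form lets one inspect or reweight a single component.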

coveralls · 2024-09-17T11:41:18Z

Pull Request Test Coverage Report for Build 11091092915

Details

106 of 128 (82.81%) changed or added relevant lines in 2 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.6%) to 78.105%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/test_utils.jl	13	19	68.42%
src/pointwise_logdensities.jl	93	109	85.32%

Totals
Change from base Build 11051106360:	0.6%
Covered Lines:	2786
Relevant Lines:	3567

💛 - Coveralls

torfjelde · 2024-09-18T13:26:05Z

> Hence, I want to query the log-densities of the prior components as seen by a sampler that generated the samples in an AbstractMCMC.AbstractChains object.

Ah, gotcha; this was the aspect I was missing 👍

> The pointwise resolution, i.e. also resolving the components of the log-density of dot_tilde_assume, such as (s[1], s[2], ..., s[n]) of s .~ Normal(...), would be nice, but it is more complex to implement together with the requirement that those densities can be modified by a child context. Reporting only their cumulative log-density is an acceptable tradeoff. If this is prohibitive, the user could reformulate the .~ as an explicit loop, because then those components are resolved with the currently suggested implementation.

Makes sense 👍

Taking this into account, I'm wondering if maybe it would be better to just generalize the existing PointwiseLikelihoodContext that we have here

DynamicPPL.jl/src/loglikelihoods.jl

Lines 2 to 5 in 24a7380

    
    struct PointwiseLikelihoodContext{A,Ctx} <: AbstractContext
        loglikelihoods::A
        context::Ctx
    end

We can just add a "switch" to it (or maybe just inspect the leaf context) to determine what logprobs we should keep around. AFAIK this should just require implementing the following:

1. tilde_assume and dot_tilde_assume
2. A quick check in (1) to determine whether we should include a variable or not.

Then we can just add alternatives to the following user-facing method

DynamicPPL.jl/src/loglikelihoods.jl

Lines 230 to 257 in 24a7380

    
    function pointwise_loglikelihoods(model::Model, chain, keytype::Type{T}=String) where {T}
        # Get the data by executing the model once
        vi = VarInfo(model)
        context = PointwiseLikelihoodContext(OrderedDict{T,Vector{Float64}}())

        iters = Iterators.product(1:size(chain, 1), 1:size(chain, 3))
        for (sample_idx, chain_idx) in iters
            # Update the values
            setval!(vi, chain, sample_idx, chain_idx)

            # Execute model
            model(vi, context)
        end

        niters = size(chain, 1)
        nchains = size(chain, 3)
        loglikelihoods = OrderedDict(
            varname => reshape(logliks, niters, nchains) for
            (varname, logliks) in context.loglikelihoods
        )
        return loglikelihoods
    end

    function pointwise_loglikelihoods(model::Model, varinfo::AbstractVarInfo)
        context = PointwiseLikelihoodContext(OrderedDict{VarName,Vector{Float64}}())
        model(varinfo, context)
        return context.loglikelihoods
    end

e.g. pointwise_prior_logprobs or something.

So all in all, basically what you've already done, but just as part of the PointwiseLikelihoodContext (which we should then subsequently rename of course).
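The chain method quoted above pushes one value per (sample, chain) pair into a flat vector and reshapes it afterwards. The following sketch, in plain Julia with made-up numbers standing in for log-densities, illustrates why that works: `Iterators.product` varies the first index fastest, which matches Julia's column-major `reshape`. The variable names are assumptions for illustration.

```julia
niters, nchains = 3, 2

# Stand-in for one entry of `context.loglikelihoods`: a flat vector that grows
# by one value per model execution.
logliks = Float64[]

# Same iteration order as in the quoted method: the sample index varies fastest.
for (sample_idx, chain_idx) in Iterators.product(1:niters, 1:nchains)
    # Hypothetical "log-density" that encodes its own position, for checking.
    push!(logliks, 10.0 * chain_idx + sample_idx)
end

# Column-major reshape puts sample `i` of chain `c` at position [i, c].
logliks_matrix = reshape(logliks, niters, nchains)

@assert logliks_matrix[2, 1] == 12.0   # sample 2 of chain 1
@assert logliks_matrix[3, 2] == 23.0   # sample 3 of chain 2
```

Any generalized context that keeps this push-then-reshape pattern would need to preserve the same iteration order.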

Thoughts?

bgctw · 2024-09-18T14:18:03Z

Trying to unify those two is a good idea. In fact, I originally started exploring/modifying based on PointwiseLikelihoodContext.

However, I did not get far with this. PointwiseLikelihoodContext resolves the dot_tilde_observe by intercepting before the acclogp!!, but can still forward the density computation to the child context. I did not manage to do that with the priors. Hence, I do not know how to implement your unifying suggestion.

bgctw · 2024-09-19T04:22:34Z

I will attempt the implementation that you suggested, assuming that components of the prior are not resolved to the same detail as the components of the likelihood.

bgctw · 2024-09-19T09:06:38Z

I pushed a new commit that integrates pointwise_loglikelihoods and varwise_logpriors into the new function pointwise_logdensities.

The hardest part was to create a single VarName from the AbstractVector{VarName} for the case where only the summed log-density of several prior components in dot_tilde_assume is to be recorded. varwise_logpriors simply used a Symbol, but the generalized pointwise_loglikelihoods requires a single VarName. The implementation at src/pointwise_logdensities.jl around line 153 has to assume several details of the Optics in the given VarNames.

Another issue is that pointwise_loglikelihoods now provides information on all variables, although the log-density of the priors is zero. Hence, one can no longer check for an empty result (around line 29) to catch the case of literal observations. How can I ask the model or VarInfo which variables are priors and which are observations?

I could not yet recreate the julia-repl block in the documentation of the function, because the current Turing, which is required for sampling in the docstring, is not compatible with the current DynamicPPL.

torfjelde · 2024-09-19T12:58:18Z

Lovely @bgctw ! I'll have a proper look at it a bit later today:)

bgctw · 2024-09-19T18:51:14Z

To let the user select the relevant information and to save processing time, it could be helpful to have two keyword arguments with defaults: report_logpriors=true and report_loglikelihoods=true. If the corresponding flag is false, the log-densities would not be calculated (not passed to the child context) and would not appear in the results. report_logpriors could be set to false in the forwarding of pointwise_loglikelihoods, which would also make it possible to check for empty results in tests again.

Would these be reasonable?
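A rough sketch of how such switches could behave, in plain Julia. Everything here is hypothetical: `records`, `collect_logdensities`, and the tuple layout are illustration-only stand-ins, not DynamicPPL types or functions, and the sketch only filters pre-computed values rather than skipping their computation.

```julia
# Hypothetical stand-in for recorded (name, kind, logp) entries; not DynamicPPL types.
records = [
    ("s", :prior, -1.3),
    ("m", :prior, -0.9),
    ("x[1]", :likelihood, -2.1),
]

# Sketch of the proposed switches: entries of a disabled kind are skipped,
# so they never reach the result.
function collect_logdensities(records; report_logpriors=true, report_loglikelihoods=true)
    result = Dict{String,Float64}()
    for (name, kind, logp) in records
        kind === :prior && !report_logpriors && continue
        kind === :likelihood && !report_loglikelihoods && continue
        result[name] = logp
    end
    return result
end

# Forwarding with report_logpriors=false leaves only the likelihood entries.
only_likelihoods = collect_logdensities(records; report_logpriors=false)
```

With both flags off the result is empty, which is the property the tests for literal observations would rely on.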

bgctw · 2024-09-21T11:03:10Z

I found a way to record single log-density prior components in dot_tilde_assume: I forward each variable to tilde_assume, but currently take the value and VarInfo from dot_tilde_assume applied with the child context. This assumes that VarInfo is not mutated in tilde_assume.
There is still the case of tilde assignment to a single multivariate distribution, where there is only a single log-density for a combination of VarNames, e.g. s .~ product_distribution([InverseGamma(2, 3) for _ in 1:d]) in TestUtils.demo_dot_assume_matrix_dot_observe_matrix. Hence, I cannot avoid combining indices of VarNames.

The forwarding of dot_tilde_assume to multiple tilde_assume works for the PointwiseLogdensityContext case. Is there potential to apply it also at other places to simplify DynamicPPL or is the separate dispatch mechanism important?

bgctw · 2024-09-22T05:04:53Z

Forwarding to tilde_assume now also works for the case of tilde assignment to a single multivariate distribution, e.g. s .~ product_distribution([InverseGamma(2, 3) for _ in 1:d]) in TestUtils.demo_dot_assume_matrix_dot_observe_matrix. There is no longer any need to combine indices of VarNames.

torfjelde

Sorry, I was working on some changes in your branch and wanted to make a PR to yours, but doesn't seem like that works due to you being on a fork o.O (or maybe I'm just being stupid).

So instead I made a new PR over at #669 . You can see the diff from yours to mine that I added here: https://github.com/TuringLang/DynamicPPL.jl/pull/669/files/5842656154a5b2f9a0377c45a4d4438933971a11..8bd2085098208fc58d1e33bbe48ec56e7efcd691

EDIT: Did this because it was a bit easier to demonstrate what I had in mind rather than explaining it through a bunch of comments

bgctw · 2024-09-23T14:04:36Z

> So instead I made a new PR over at #669 . You can see the diff from yours to mine that I added here: https://github.com/TuringLang/DynamicPPL.jl/pull/669/files/5842656154a5b2f9a0377c45a4d4438933971a11..8bd2085098208fc58d1e33bbe48ec56e7efcd691

I see

- the different undeprecated subtypes of pointwise_logdensities
- the _istcontext to suppress entire prior or likelihood recording

I do not understand some of the code (some dubbed Hack).

Your PR is based on an older version of this PR. What is the way forward now? Should I try to merge your changes into this PR? Or should I try to implement my subsequent changes on top of your PR?
I am not very experienced with forks and contributing to pull requests. How can I make my fork writable for you?

bgctw first forwarded dot_tilde_assume to get a correct vi and then recomputed it to record the component prior densities. This was replaced by torfjelde's hack, which drops vi completely and recombines the values, so that assume is called only once for each VarName.

bgctw · 2024-09-24T07:10:51Z

I transferred the developments in #669 to this PR. The solution of dropping the updated VarInfos and relying only on the non-dot version is more efficient than my version of recomputing the VarInfo from the original dot-version of tilde_assume. However, the "flatten and recombine" hack (line 230) for the multivariate distribution is hard to comprehend, and one needs to remember that when modifying dot_tilde_assume, one now also needs to adapt _point_tilde_assume consistently.

torfjelde · 2024-09-26T09:50:00Z

> Then, samplers and other code would only deal with the non-dot versions. Maybe this makes a few performance and other tweaks impossible, such as samplers hooking into the dot-dispatch. But this would be more gentle compared to deprecate the support for .~ entirely.

Regarding this, I think it's worth preserving the .~ in the same sense as .= works in Julia:) It does both have performance implications and is a nice semantic for users. But yeah, agree that it's a bit annoying.

bgctw · 2024-09-26T10:43:55Z

The suggestions from code review introduced some errors in the tests, which I tried to fix. However, I did not succeed for the "pointwise_logdensities chain" testset. Could you please have another look to see whether this is a problem of the test setup or of the tested functionality? Your test is stricter, because it compares to logjoint_true(model, val...) rather than pointwise_logdensities(model, VarInfo).

torfjelde · 2024-09-26T11:27:58Z

Ah yeah, it's failing because DynamicPPL.TestUtils.varnames(model) only returns the variables that are considered random, i.e. excluding the observations. Lemme have a go at fixing this 👍

torfjelde · 2024-09-26T11:38:51Z

Fixed the test @bgctw :)

codecov · 2024-09-26T12:29:39Z

Codecov Report

Attention: Patch coverage is 82.81250% with 22 lines in your changes missing coverage. Please review.

Project coverage is 77.66%. Comparing base (067ac4c) to head (cff0941).
Report is 1 commits behind head on master.

Files with missing lines	Patch %	Lines
src/pointwise_logdensities.jl	85.32%	16 Missing ⚠️
src/test_utils.jl	68.42%	6 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #663      +/-   ##
==========================================
+ Coverage   75.93%   77.66%   +1.73%     
==========================================
  Files          29       29              
  Lines        3519     3587      +68     
==========================================
+ Hits         2672     2786     +114     
+ Misses        847      801      -46

Flag	Coverage Δ
	`77.66% <82.81%> (+1.73%)`	⬆️

bgctw · 2024-09-27T11:45:33Z

Thanks @torfjelde for patiently guiding me through this process.

bgctw · 2024-09-27T11:59:58Z

Another maybe:

I find it more convenient to work with the results of the pointwise functions applied to AbstractChains as an AbstractChains again, rather than the OrderedDict{String, Matrix}.
What is a good place to support this conversion? A function in the MCMCChains extension such as:

    function as_chains(lds_pointwise)
        Chains(stack(values(lds_pointwise); dims=2), collect(keys(lds_pointwise)))
    end
    chn = as_chains(logjoints_pointwise); # from @testset "pointwise_logdensities chain"
    names(chn)
    get(chn, :x)[1] == logjoints_pointwise["x"]
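The key step in the snippet above is `stack(...; dims=2)`, which turns one iterations × chains matrix per variable into a single iterations × variables × chains array. A minimal sketch of just that step in plain Julia (requires Julia 1.9 or later for `stack`); the names `var_names` and `mats` and the constant fill values are assumptions for illustration:

```julia
niters, nchains = 4, 2

# Hypothetical pointwise results: one niters × nchains matrix per variable,
# filled with the variable's index so the layout is easy to check.
var_names = ["s", "m", "x[1]"]
mats = [fill(Float64(k), niters, nchains) for k in eachindex(var_names)]

# `dims=2` inserts the variable axis in the middle.
arr = stack(mats; dims=2)

@assert size(arr) == (niters, length(var_names), nchains)
@assert arr[:, 2, :] == mats[2]   # slice for the second variable
```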

One could even think of letting the pointwise_logdensities(..., ::AbstractChains) routinely return a Chains object.
This could be achieved by

- renaming the pointwise_logdensities(..., ::AbstractChains) to pointwise_logdensities_dict,
- implementing it in the DynamicPPLMCMCChains extension by converting the result of pointwise_logdensities_dict.

Since this would break the current interface of pointwise_logdensities, I make this suggestion here rather than in a separate issue, before pointwise_logdensities (and siblings) go to the master branch.

torfjelde · 2024-09-27T13:42:28Z

I just pushed a final change to the docstring of pointwise_logdensities (which was out of date) + made it a doctest. Once tests pass now, I'll add it to the merge queue:)

> Thanks @torfjelde for patiently guiding me through this process.

Of course! Glad to hear you found it useful:)

> I find it more convenient to work with the results of the pointwise functions applied to AbstractChains as an AbstractChains again, rather than the OrderedDict{String, Matrix}.

Hmm, I'm a bit uncertain about this. I do see your reasoning that it might be beneficial, but I think, at least at the moment, I'm reluctant to make this part of DynamicPPL 😕 Generally, we adopt features in DPPL once we feel like there's sufficient need for it; atm, I think most people using pointwise_logdensities (me being amongst them), would rather work with an OrderedDict instead of Chains 😕

But how about you convert that comment into an issue so a) we can keep track of the desired feature and see if there are other people who share the interest in this, and b) so that the current impl you are using can also be discovered more easily by others?:)

bgctw · 2024-09-27T14:55:07Z

> But how about you convert that comment into an issue so a) we can keep track of the desired feature and see if there are other people who share the interest in this, and b) so that the current impl you are using can also be discovered more easily by others?:)

I will do that after it's available on master.

torfjelde · 2024-09-30T08:30:21Z

Added it to the merge queue; thank you @bgctw !

bgctw added 3 commits September 16, 2024 08:53

- implement pointwise_logpriors (d05124c)
- implement varwise_logpriors (4f46102)
- remove pointwise_logpriors (c6653b9)

torfjelde reviewed Sep 17, 2024 (src/context_implementations.jl)

- revert dot_assume to not explicitly resolve components of sum (216d50c)
- docstring varwise_logpriores (fd8d3b2)
- use loop for prior in example. Unfortunately cannot make it a jldoctest, because relies on Turing for sampling
- integrate pointwise_loglikelihoods and varwise_logpriors by pointwise_densities (5842656)
- record single prior components by forwarding dot_tilde_assume to tilde_assume (18beb57)
- forward dot_tilde_assume to tilde_assume for Multivariate (d9945d7)

torfjelde mentioned this pull request Sep 23, 2024: Suggestions for pointwise_logdensities and siblings #669 (Open)

torfjelde reviewed Sep 23, 2024 (src/deprecated.jl, src/test_utils.jl, test/pointwise_logdensities.jl, src/test_utils.jl)

bgctw added 4 commits September 24, 2024 07:42

- avoid recording prior components on leaf-prior-context and avoid recording likelihoods when invoked with leaf-Likelihood context (656a757)
- undeprecate pointwise_loglikelihoods and implement pointwise_prior_logdensities mostly taken from TuringLang#669 (7aa9ebe)
- include docstrings of pointwise_logdensities; pointwise_prior_logdensities int api.md docu (9dfb9ed)

bgctw and others added 4 commits September 26, 2024 12:12

- Apply suggestions from code review: clean up comments and Imports (17b251a). Co-authored-by: Tor Erlend Fjelde <[email protected]>
- Apply suggestions from code review: change test of applying to chains on already used model (7e990f0). Co-authored-by: Tor Erlend Fjelde <[email protected]>
- fix test on names in likelihood components to work with literal models (8706f68)
- try to fix testset pointwise_logdensities chain (073a325)

torfjelde reviewed Sep 26, 2024 (test/pointwise_logdensities.jl)

- Update test/pointwise_logdensities.jl (23e1711)

torfjelde reviewed Sep 26, 2024 (.gitignore)

torfjelde and others added 4 commits September 26, 2024 12:40

- Update .gitignore (34ae4f8)
- Merge branch 'master' into pointpriors (1f251d1)
- Formtating (777624a)
- Fixed tests (4864e60)
- Updated docs for pointwise_logdensities + made it a doctest not dependent on Turing.jl (4d3b0c0)
- Bump patch version (e54fa4e)

torfjelde added 2 commits September 27, 2024 17:28

- Remove blank line from @model in doctest to see if that fixes the parsing issues (bcd82a9)
- Added doctest filter to handle the ;;] at the end of lines for matrices (cff0941)

torfjelde approved these changes Sep 30, 2024

torfjelde enabled auto-merge September 30, 2024 08:30

torfjelde added this pull request to the merge queue Sep 30, 2024

Merged via the queue into TuringLang:master with commit 8c3aa44 Sep 30, 2024
12 of 13 checks passed
