Addressing TODOs for Instance struct #533
- nmodl.yaml file is more for language constructs
- InstanceStruct is specific for code generation and hence
move it to codegen.yaml
- instance structure now contains all global variables
- instance structure now contains index variables for ions
- nrn_state kernel now has all variables converted to instance
- InstanceVarHelper added to query a variable and its location
- nmodl_to_json helper added in main.cpp
- added --vector-width CLI option
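To make the "query a variable and its location" idea concrete, here is a minimal sketch. It is not the actual InstanceVarHelper from this PR: `InstanceLayout`, `add_member` and `index_of` are hypothetical names, and the only assumption is that each instance variable occupies one struct slot in declaration order, which is the index a struct `getelementptr` would need.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a helper that answers "is this variable a member
// of the instance struct, and at which member index?". Not the PR's real API.
class InstanceLayout {
  public:
    // Append a member; its position in the struct is its insertion order.
    void add_member(std::string name) {
        members.push_back(std::move(name));
    }

    // Return the zero-based member index, or -1 if the name is not a member.
    int index_of(const std::string& name) const {
        for (std::size_t i = 0; i < members.size(); ++i) {
            if (members[i] == name) {
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // True if the variable lives in the instance struct.
    bool is_instance_variable(const std::string& name) const {
        return index_of(name) >= 0;
    }

  private:
    std::vector<std::string> members;
};
```

A code generator could call `index_of("m")` to obtain the constant struct index for a member access, and fall back to a local variable when `is_instance_variable` returns false.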
Can one of the admins verify this patch?
8b1d186 to b9b2061
@georgemitenkov : I pushed a change to fix the GitLab CI. There seem to be some minor conflicts, though, because earlier today I force-pushed a few things to my PR. The conflicts are minor and you can fix them. As we have dependent PRs, I think we can merge your PR #533 into my #531, and you can continue working in my branch pramodk/llvm-instance-type. As I can't merge my PR into the llvm branch without your PR, this will avoid juggling changes and conflicts.
Great! I will make the changes.
Sure, I am nearly done with simple vector code generation, so I will make this PR ready by tonight.
OK, great! Feel free to change my or @alkino's PRs to prepare one final PR ready for merge. Then I will do a final check and we can merge it. In the meantime I will start sketching something for the testing aspect. As for the GitLab CI failures: it's just cmake-format and clang-format stuff. Tests are passing, so it's good!
@pramodk Regarding testing, I had one idea:
This is not an actual IR check, but it suits integration-test purposes.
For example, something like this: I am currently using this to verify the vectorised code.
…m/BlueBrain/nmodl into georgemitenkov/llvm-instance-type
@pramodk @iomaganaris This PR is now ready! We now fully support scalar kernel code generation plus compute-only (i.e. no scatter/gather or control flow) vectorised code generation. There are no tests at the moment, but with the issue just opened I think these can be added later. So far, I have tested by comparing to the LLVM IR generated from C++ and its manually defined vector version. See example below :)
Scalar code is supported fully (running […]).
VOID nrn_state_hh(INSTANCE_STRUCT *mech){
INTEGER id
for(id = 0; id<mech->node_count; id = id+4) {
INTEGER node_id
DOUBLE v
mech->m[id] = mech->m[id]+(1.0-exp(mech->dt*((((-1.0)))/mech->mtau[id])))*(-(((mech->minf[id]))/mech->mtau[id])/((((-1.0)))/mech->mtau[id])-mech->m[id])
mech->h[id] = mech->h[id]+(1.0-exp(mech->dt*((((-1.0)))/mech->htau[id])))*(-(((mech->hinf[id]))/mech->htau[id])/((((-1.0)))/mech->htau[id])-mech->h[id])
mech->n[id] = mech->n[id]+(1.0-exp(mech->dt*((((-1.0)))/mech->ntau[id])))*(-(((mech->ninf[id]))/mech->ntau[id])/((((-1.0)))/mech->ntau[id])-mech->n[id])
}
}
The output for […]:
define void @nrn_state_hh(%hh__instance_var__type* %mech1) {
%mech = alloca %hh__instance_var__type*, align 8
store %hh__instance_var__type* %mech1, %hh__instance_var__type** %mech, align 8
%id = alloca i32, align 4
store i32 0, i32* %id, align 4
; These 2 lines are not really needed atm.
%__vec_id = alloca <4 x i32>, align 16
store <4 x i32> <i32 0, i32 1, i32 2, i32 3>, <4 x i32>* %__vec_id, align 16
br label %for.cond
for.cond: ; preds = %for.inc, %0
%1 = load %hh__instance_var__type*, %hh__instance_var__type** %mech, align 8
%2 = getelementptr inbounds %hh__instance_var__type, %hh__instance_var__type* %1, i32 0, i32 43
%3 = load i32, i32* %2, align 4
%4 = load i32, i32* %id, align 4
%5 = icmp slt i32 %4, %3
br i1 %5, label %for.body, label %for.exit
for.body: ; preds = %for.cond
%node_id = alloca i32, align 4
%v = alloca double, align 8
; Getting the pointer to double* m
%6 = load %hh__instance_var__type*, %hh__instance_var__type** %mech, align 8
%7 = getelementptr inbounds %hh__instance_var__type, %hh__instance_var__type* %6, i32 0, i32 13
%8 = load i32, i32* %id, align 4
%9 = sext i32 %8 to i64
%10 = load double*, double** %7, align 8
; A simple bitcast is sufficient since the data is consecutive!
%11 = bitcast double* %10 to <4 x double>*
; Load the actual values to the vector, then we proceed with other computations.
%12 = getelementptr inbounds <4 x double>, <4 x double>* %11, i64 %9
%13 = load <4 x double>, <4 x double>* %12, align 32
; Similar instructions for all ...[id], etc.
...
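The load sequence in the IR above can be mirrored in plain C++ to check the addressing. This is only a sketch under stated assumptions: `Vec4` and `load4` are hypothetical names, and the width of 4 is taken from the loop step in the kernel. One point worth checking against the IR: a `getelementptr` on a `<4 x double>*` steps in whole vectors (32 bytes per index), so the sketch applies the element offset `id` to the `double*` first and only then reads four lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical 4-lane value type standing in for LLVM's <4 x double>.
using Vec4 = std::array<double, 4>;

// Read data[id], data[id + 1], data[id + 2], data[id + 3] into one vector.
// The offset is applied in units of double, before the 4-wide read.
// `count` is the number of doubles behind `data`.
inline Vec4 load4(const double* data, std::size_t count, std::size_t id) {
    assert(id + 4 <= count);  // the four lanes must stay inside the array
    Vec4 lanes;
    std::memcpy(lanes.data(), data + id, 4 * sizeof(double));
    return lanes;
}
```

With `id` advancing by 4 per iteration, successive calls read elements 0..3, 4..7, and so on, which is the consecutive-data case the bitcast comment relies on.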
pramodk left a comment
LGTM - I added some minor comments.
@georgemitenkov : this is ready to merge into my PR, right? Just asking because there is one TODO in the description.
@pramodk It is ready for the merge. I have compared the LLVM output manually, and it looks correct. Since we do not know how to test it properly yet, the tests can be added later separately.
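Until proper tests exist, one way to automate the manual comparison is to run a scalar reference and a width-4 version of the same update on identical data and compare the results. The sketch below is an illustration, not NMODL code: `Gate`, `step`, `update_scalar`, `update_width4` and `make_gate` are hypothetical, only the `m` gate is modelled, and the update uses the algebraically simplified form of the kernel expression, `m + (1 - exp(-dt/mtau)) * (minf - m)`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical struct-of-arrays for a single gating variable.
struct Gate {
    std::vector<double> m, minf, mtau;
    double dt = 0.025;
};

// One-element update: simplified form of the kernel expression.
inline double step(double m, double minf, double mtau, double dt) {
    return m + (1.0 - std::exp(-dt / mtau)) * (minf - m);
}

// Scalar reference: one node per iteration.
inline void update_scalar(Gate& g) {
    for (std::size_t id = 0; id < g.m.size(); ++id) {
        g.m[id] = step(g.m[id], g.minf[id], g.mtau[id], g.dt);
    }
}

// Width-4 version: four consecutive nodes per outer iteration.
// Assumes the node count is a multiple of 4 (no remainder loop).
inline void update_width4(Gate& g) {
    assert(g.m.size() % 4 == 0);
    for (std::size_t id = 0; id < g.m.size(); id += 4) {
        for (std::size_t lane = 0; lane < 4; ++lane) {
            const std::size_t i = id + lane;
            g.m[i] = step(g.m[i], g.minf[i], g.mtau[i], g.dt);
        }
    }
}

// Build a small deterministic data set with `count` nodes.
inline Gate make_gate(std::size_t count) {
    Gate g;
    for (std::size_t i = 0; i < count; ++i) {
        g.m.push_back(0.05 + 0.01 * static_cast<double>(i));
        g.minf.push_back(0.5);
        g.mtau.push_back(1.0 + 0.1 * static_cast<double>(i));
    }
    return g;
}

// True if both gates hold the same m values up to a small tolerance.
inline bool same_state(const Gate& a, const Gate& b) {
    if (a.m.size() != b.m.size()) {
        return false;
    }
    for (std::size_t i = 0; i < a.m.size(); ++i) {
        if (std::fabs(a.m[i] - b.m[i]) > 1e-12) {
            return false;
        }
    }
    return true;
}
```

A test would copy one `Gate`, run `update_scalar` on one copy and `update_width4` on the other, and require `same_state` to hold.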
- remove undefined visit_codegen_instance_var
- Improved member creation for instance struct
- Instance struct type generation for kernel arguments
- Proper integration of instance struct
- Added scalar code generation for the kernel
- Removed instance test since it is not created explicitly anymore
- Fixed ordering for precision and width in LLVM Visitor
- Added vector induction variable
- Vectorised code for compute with direct loads fully functional
- Instance naming fixed
- (LLVM IR) Fixed compute vector code generation types
- refactoring: improve conversion of double to int for the loop expressions

This PR addresses certain TODOs from 2da2bc4 and would help with instance struct integration (see main PR)
Scalar integration:
- CodegenVarType visitor
- nrn_state generation
Simple direct index vectorization:
Other:
- [ ] Update tests to reflect the changes and test the kernel (will be done in a separate PR)