Stumped by ACCESS_VIOLATION error


ErwinJ
12-03-2003, 01:21 PM
Hello everyone,

I'm currently working on a program that fails to run, due to a very mysterious EXCEPTION_ACCESS_VIOLATION error.

The error occurs at the end of a subroutine (exactly at the 'end subroutine' statement). At runtime the error message is:

jwe0019i-u The program was terminated abnormally with Exception Code EXCEPTION_ACCESS_VIOLATION.
Error occurs at or near line 1572 of _windig@_reco_haplo_miss_
Called from or near line 370 of _windig@_sortmarkers_
Called from or near line 112 of _windig@_windighap_
Called from or near line 35 of _MAIN__
error summary (Fortran)
error number error level error count
jwe0019i u 1
total error count = 1

When running through the Fujitsu debugger, at that point (line 1572) I get several messages:

EXCEPTION_ACCES_VIOLATION_ERROR (0xc0000005): jwe_xdal()

:::0x951bf0:1572 contains invalid path
(this is a translation from the original Dutch message, though)

and

Cannot open file :::0x951bf0:1572

That last one puzzled me, because there are no file or data edit statements in that subroutine.

Because the error occurs at the point where control is handed back from the subroutine to the routine calling it, I thought maybe there was something wrong with the interface. But that all checks out (same attributes, dimensions).

Now, I'm compiling and running the program on a Win2000 machine (Lahey-Fujitsu Fortran95 v5.5j). I've read somewhere that this error could have something to do with trying to access a protected memory address or some such. Does anyone know a workaround, an option switch or any other solution? I'm really stumped...

tzeis
12-03-2003, 05:33 PM
In cases like this, the -chk and -chkglobal options are your best friend. They should detect the source of the problem, instead of just telling you where the consequences are occurring. Remember that if you are going to use -chkglobal, you will have to compile the entire program with the option.

ErwinJ
12-04-2003, 09:56 AM
Thanks, but I already did that and it isn't telling me squat.

I went through the entire program one step at a time and there just isn't anything that would indicate something amiss until I get to the 'end subroutine' statement, when the program comes crashing down.

What's more, even with -chk or -chkglobal enabled there is no indication of where the illegal write (if that is what is happening) goes. The only other clue is in the Input Command Log, which at the crash reports: 'Signaled EXCEPTION_ACCESS_VIOLATION at Process ID:0x51c, Thread ID:0x6b4'. That really doesn't help me very much.

Any other suggestions?

[edit] We just tried to compile the program on an older version of Fortran (Lahey Fortran90) and the damn thing runs fine, even on my machine :confused: . Are there issues with the F95 compiler I should know about?

tzeis
12-04-2003, 05:24 PM
When an access violation occurs and -chk doesn't catch it, the problem is almost always due to a pointer fault of some kind, like a dangling pointer. The fact that it appears to run ok with another compiler does not necessarily mean that you don't have a problem in your code. If you want to show your code, maybe I can be more specific...

ErwinJ
12-05-2003, 10:30 AM
"If you want to show your code, maybe I can be more specific..."

Thanks for the assistance, but I couldn't do that. First, it's a module of some 3000 lines of code, in which a set of variables is passed on to a string of 7 or 8 subroutines. It took me 3 months just to get some understanding of that module alone (it's not mine, I didn't write it). Second, it's part of an R&D project and therefore publicizing the code is a tad sensitive.

The code was originally developed in Fortran90 and needs updating to be made workable with other code we're developing.

When you say dangling pointer, what do you actually mean? Could it be that I should check for some allocatable variables that didn't get deallocated or some such?

BTW, thanks for the help so far. Didn't solve the problem (yet) but at least I'm learning...:D

I think it may be a win2000 issue along with any issues the code itself may have. Compiling the code on the win2000 machine with the F90 compiler I get a number of Memory Protection Faults at the point where the modules are linked. Can anyone explain what that is all about?

tzeis
12-08-2003, 05:34 PM
A dangling pointer occurs when the thing that a pointer is pointing at becomes undefined. Except in an extremely limited set of circumstances, an undefined pointer must not be referred to; any reference can cause unpredictable results. In this context, "unpredictable" can range from the program apparently executing successfully, to executing but getting wrong answers, to crashing. Dangling pointers can be extremely difficult to detect, and in fact, the Fortran standard explicitly states that those implementing the Fortran standard have no obligation to detect a pointer in a dangling condition.

All pointers are initially undefined, and referring to them before their status has been defined is taboo. The best way to define the status of a pointer is to nullify it at the time it is declared. For example:


real, pointer :: a(:)
real, pointer :: b(:) => null()


The status of "a" is undefined, and it must be defined before it can be referred to, or tested for association status. The status of "b" is defined, and "b" has a "disassociated" status.
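As a minimal sketch of the point above (the program and variable names are made up for illustration), a pointer that starts out nullified can be safely tested with associated() at any time:

```fortran
program pointer_status
  implicit none
  ! Hypothetical names, for illustration only.
  real, pointer :: b(:) => null()              ! defined status: disassociated
  real, target  :: t(3) = (/ 1.0, 2.0, 3.0 /)

  ! Legal: the status of b is defined, so it may be tested.
  print *, 'before assignment: ', associated(b)

  b => t                                       ! b now points at t
  print *, 'after assignment:  ', associated(b), associated(b, t)

  nullify(b)                                   ! back to disassociated
  print *, 'after nullify:     ', associated(b)
end program pointer_status
```

Had b been declared like "a" above, without => null(), the first associated() test would not be allowed.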

One thing that might cause a dangling pointer is to point to an allocated variable, and then to deallocate the allocatable variable without nullifying or reassigning the pointer. The pointer variable then has an undefined status, i.e. it is left "dangling".
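A small sketch of this first case, with hypothetical names (work, p). The invalid references are left commented out, so the program as written stays legal:

```fortran
program dealloc_dangling
  implicit none
  ! Hypothetical names, for illustration only.
  real, allocatable, target :: work(:)
  real, pointer :: p(:) => null()

  allocate(work(5))
  work = 1.0
  p => work                    ! p is associated with work
  print *, 'sum via p: ', sum(p)

  deallocate(work)             ! p is now dangling: its status is undefined

  ! Both of the following would be invalid at this point:
  ! print *, p(1)
  ! print *, associated(p)

  nullify(p)                   ! allowed, and gives p a defined status again
  print *, 'associated after nullify: ', associated(p)
end program dealloc_dangling
```

The habit this suggests is to nullify (or reassign) every pointer that refers to an allocatable variable around the time that variable is deallocated.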

Another thing that might cause problems is to point a pointer dummy argument at a local variable in a subprogram. On exit from the subprogram, the local variable might become undefined (unless it has the SAVE attribute), and leave the pointer dangling.
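The second case might look something like the sketch below. All names (dangling_demo, bad_attach, good_attach, scratch) are invented for illustration. bad_attach shows the pattern to look for and is deliberately never called; good_attach is one possible alternative, in which the pointer owns storage that outlives the subroutine:

```fortran
module dangling_demo
  implicit none
contains

  ! Problem pattern: the caller's pointer is left pointing at a local.
  subroutine bad_attach(p)
    real, pointer :: p(:)
    real, target  :: scratch(4)   ! local variable, not SAVEd
    scratch = 0.0
    p => scratch
  end subroutine bad_attach       ! on return, p is dangling

  ! One alternative: allocate through the pointer itself.
  subroutine good_attach(p)
    real, pointer :: p(:)
    allocate(p(4))                ! storage survives the return
    p = 0.0
  end subroutine good_attach

end module dangling_demo

program local_target_demo
  use dangling_demo
  implicit none
  real, pointer :: q(:) => null()

  call good_attach(q)
  print *, 'size of q: ', size(q)
  deallocate(q)                   ! caller is responsible for freeing it
end program local_target_demo
```

A crash that appears exactly at an 'end subroutine' statement is at least consistent with this kind of fault, though that is only a guess without seeing the code.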

Hopefully, this will give you some idea of where to start looking to chase the problem down.