Saturday, November 20, 2010

The struggle recapitulated

FIRST ATTEMPT

Note: I am talking to myself. Will move these to another place soon.



setenv USER_CC pgcc
setenv USER_FC mpif90
module load netcdf
setenv INC_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/include
setenv LIB_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/lib/
cd cam1/models/atm/cam/bld
./configure


Trying WITHOUT setting LIB_MPI or INC_MPI ; for reference they are in

MPICH_HOME=/opt/apps/pgi7_2/mvapich/1.0.1

Basis for not setting them:

INC_MPI
Directory containing the MPI include files. This is only required when CAM is built with SPMD enabled.


LIB_MPI
Directory containing the MPI library. This is only required when CAM is built with SPMD enabled.

and
-[no]spmd
-spmd enables an SPMD configuration of CAM (via MPI). -nospmd disables an SPMD configuration of CAM. SPMD is enabled by default only on IBM systems.

result

** Invalid build directory: /share/home/00671/tobis/CAM3Bld/cam1/models/atm/cam/bld
** The specified build directory is the same as the configuration script
** directory. This is not allowed because the Makefile produced by configure
** would overwrite the standard Makefile. Use a different build directory.


go back to CAM root


cd ~/CAM3Bld
mkdir build
cd build/
../cam1/models/atm/cam/bld/configure


No complaints from configure. This makes 6 files and an empty esmf directory. One of the files is a Makefile. So


make


fails.


cd /share/home/00671/tobis/CAM3Bld/cam1/models/utils/esmf; \
echo "Build the ESMF library."; \
echo "ESMF is NOT supported by the CCSM project, but by the ESMF core team in NCAR/SCD"; \
echo "See http://www.esmf.ucar.edu"; \
gmake -j 1 BOPT=O ESMF_BUILD=/share/home/00671/tobis/CAM3Bld/build/esmf ESMF_DIR=/share/home/00671/tobis/CAM3Bld/cam1/models/utils/esmf ESMF_ARCH=;
Build the ESMF library.
ESMF is NOT supported by the CCSM project, but by the ESMF core team in NCAR/SCD
See http://www.esmf.ucar.edu
gmake[1]: Entering directory `/share/home/00671/tobis/CAM3Bld/cam1/models/utils/esmf'

makefile:15: /share/home/00671/tobis/CAM3Bld/cam1/models/utils/esmf/build//base: No such file or directory
make[1]: *** No rule to make target `/share/home/00671/tobis/CAM3Bld/cam1/models/utils/esmf/build//base'. Stop.


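OK, so a couple of clues here: ESMF_ARCH=; (the variable is empty) and /share/home/00671/tobis/CAM3Bld/cam1/models/utils/esmf/build//base, where the double slash is presumably the spot the architecture name should have filled.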

So THE LIST OF CAM ENVIRONMENT VARIABLES EXCLUDES ESMF! (You have to guess that. Maybe "ESMF is NOT supported by the CCSM project, but by the ESMF core team in NCAR/SCD" is supposed to be helpful.) Next, look in cam1/models/utils/esmf/build.

Here we see

alpha common_g config Darwin_xlf linux_altix linux_pathscale rs6000_sp
base_variables.defs common_O cray_x1 ES linux_gnupgf90 linux_pgi solaris
common common_variables cray_x1_ssp IRIX linux_intel README solaris_hpc
common_ conf.defs Darwin_absoft IRIX64 linux_lf95 rs6000_64 SX6

README is singularly unhelpful:
# $Id: README,v 1.1.8.1 2002/04/27 15:38:57 erik Exp $
The build directory contains all the base makefiles that are
included in your actual makefile. See the users manual for a
description of all the flags and rules in the makefiles.

It doesn't even give me a clue where to find "the user's manual". A person not up on NCAR politics might assume this was the CAM or CCSM manual.

So candidates in my case are linux_pgi and linux_gnupgf90. Looking inside the latter, it calls gnucc, which I think I don't want. So


setenv ESMF_ARCH linux_pgi
cd ~/CAM3Bld/
mv build buildx01
mkdir build
cd build
../cam1/models/atm/cam/bld/configure


The only diff at this point is two new files in the old directory left over from the failed build, so the re-run of configure was not needed.
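In hindsight the ESMF failure was just an unset ESMF_ARCH leaving a hole in the path (build//base). A small pre-flight check would catch that before make does. This is a sketch in POSIX sh rather than the csh used above; check_esmf_arch is a made-up helper name, and it only assumes that each architecture has its own directory under the ESMF build directory, as the listing suggests.

```shell
# Hypothetical helper: verify ESMF_ARCH names a directory under the ESMF build tree.
# Usage: check_esmf_arch /path/to/cam1/models/utils/esmf
check_esmf_arch() {
    esmf_dir=$1
    if [ -z "${ESMF_ARCH:-}" ]; then
        echo "ESMF_ARCH is not set; make would look in $esmf_dir/build//base" >&2
        return 1
    fi
    if [ ! -d "$esmf_dir/build/$ESMF_ARCH" ]; then
        echo "no such architecture directory: $esmf_dir/build/$ESMF_ARCH" >&2
        return 1
    fi
    echo "ESMF_ARCH=$ESMF_ARCH looks usable"
}
```

Something like check_esmf_arch ~/CAM3Bld/cam1/models/utils/esmf before running make would have flagged the empty variable right away.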

Make gets a lot further.

Now... gets past all the ESMF stuff (lots of it, which as we know has little purpose) and fails at


mpif90 -c -DHIDE_MPI /share/home/00671/tobis/CAM3Bld/cam1/models/atm/cam/src/control/string_utils.F90
mpif90 -c -DHIDE_MPI /share/home/00671/tobis/CAM3Bld/cam1/models/csm_share/shr/shr_kind_mod.F90
mpif90 -c -DHIDE_MPI /share/home/00671/tobis/CAM3Bld/cam1/models/csm_share/shr/shr_mpi_mod.F90
mpif90 -c -DHIDE_MPI /share/home/00671/tobis/CAM3Bld/cam1/models/csm_share/shr/shr_sys_mod.F90
mpif90 -c -DHIDE_MPI /share/home/00671/tobis/CAM3Bld/cam1/models/atm/cam/src/control/mpishorthand.F
PGF90-F-0226-Can't find include file misc.h (/share/home/00671/tobis/CAM3Bld/cam1/models/atm/cam/src/control/mpishorthand.F: 1)
PGF90/x86-64 Linux 7.2-5: compilation aborted


but, but, but misc.h is right there in the build directory. Why is PGF90 not looking in the current directory for includes?

Charles suggests using straight pgf90 rather than mpif90. This did nothing; then I thought to rerun configure. Haha! It built. First day out!

This is a serial CAM though. Not too much use. But interesting. If it can build under gnu in serial, it could run on a Mac, couldn't it?

However, must rebuild for parallel run.

SECOND ATTEMPT


setenv USER_CC pgcc
setenv USER_FC mpif90
module load netcdf
setenv INC_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/include
setenv LIB_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/lib/
setenv INC_MPI
setenv LIB_MPI
mkdir buildpar
cd buildpar
../cam1/models/atm/cam/bld/configure -spmd


This fails because INC_MPI is being ignored, as well as the current working directory.

The user guide says exactly nothing about command-line invocation. Cannot find a PGI document that does. The man page says

Add directory to the compiler's search path for include files. For include files surrounded by < >, each -I
directory is searched followed by the standard area. For include files surrounded by " ", the directory containing
the file containing the #include directive is searched, followed by the -I directories, followed by the standard
area.


So is it


-I dir1 -I dir2
or
-I dir1 dir2
or
-Idir1:dir2
or what?
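For what it's worth, the usual convention for Unix compilers is one -I per directory, written -Idir1 -Idir2, which is also the form the CAM makefile uses (-I$(INC_MPI)). As far as I know the PGI compilers follow it; colon-separated lists belong to things like PATH, not to -I. A minimal sketch in POSIX sh (not the csh used elsewhere in these notes) that turns a list of directories into such flags; include_flags is a made-up helper name.

```shell
# Hypothetical helper: print one -I option per directory argument.
include_flags() {
    flags=""
    for dir in "$@"; do
        flags="$flags -I$dir"
    done
    # Strip the leading space left by the loop.
    echo "${flags# }"
}

include_flags . /opt/apps/pgi7_2/mvapich/1.0.1/include
```

The -I. entry is the one that would put the build directory, where misc.h lives, on the search path.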

I already set INC_MPI per instructions. So why is that ignored??

The makefile has

ifeq ($(SPMD),TRUE)
FFLAGS += -I$(INC_MPI)
LDFLAGS += -L$(LIB_MPI) -lmpi


AARGH NOW I AM DEBUGGING THE MAKEFILE


By the time the routine in question comes around the FFLAGS are lost.

I hate Makefiles anyway.
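One way to see which flags a given file actually gets, without waiting on a real build, is to capture a dry run (make -n prints the commands without running them) and pull out the compile line for that file. Sketched in POSIX sh rather than csh; flags_for is a hypothetical helper that reads the captured output on stdin, and the sample line is the failing one from the first attempt with its path shortened.

```shell
# Hypothetical helper: from make output on stdin, print the -I and -D flags
# on any command line that mentions the given source file.
flags_for() {
    grep -F -- "$1" | tr -s ' ' '\n' | grep -E '^-[ID]'
}

# Sample input: the failing compile line, path shortened for the example.
echo "mpif90 -c -DHIDE_MPI src/control/mpishorthand.F" | flags_for mpishorthand.F
```

For that line the only flag that comes back is -DHIDE_MPI, with no -I at all, which fits the misc.h complaint. Against a real build the equivalent would be something like make -n | flags_for mpishorthand.F.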

THIRD ATTEMPT

# setenv USER_CC pgcc
# setenv USER_FC mpif90
module load netcdf
setenv INC_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/include
setenv LIB_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/lib/
#setenv INC_MPI
#setenv LIB_MPI
mkdir buildpar
cd buildpar
../cam1/models/atm/cam/bld/configure -spmd


** Cannot find mpif.h in specified directory: /usr/local/include
**


setenv USER_CC pgcc
setenv USER_FC mpif90
module load netcdf
setenv INC_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/include
setenv LIB_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/lib/
#setenv INC_MPI
#setenv LIB_MPI
mkdir buildpar
cd buildpar
../cam1/models/atm/cam/bld/configure -spmd


same
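Neither of these runs sets INC_MPI to anything useful, so configure apparently ends up looking in /usr/local/include and gives up. Before pointing INC_MPI somewhere, it is cheap to confirm that the directory really holds mpif.h. A sketch in POSIX sh rather than csh; has_mpif_h is a hypothetical name.

```shell
# Hypothetical helper: succeed if the given directory contains mpif.h.
has_mpif_h() {
    if [ -f "$1/mpif.h" ]; then
        echo "found $1/mpif.h"
    else
        echo "no mpif.h in $1" >&2
        return 1
    fi
}
```

The obvious candidate to try it on is the include directory under MPICH_HOME, for example has_mpif_h /opt/apps/pgi7_2/mvapich/1.0.1/include, though that the header lives there is a guess until the check passes.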


#setenv USER_CC pgcc
#setenv USER_FC mpif90
unsetenv USER_CC
unsetenv USER_FC
module load netcdf
setenv INC_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/include
setenv LIB_NETCDF /opt/apps/pgi7_2/netcdf/3.6.2/lib/
setenv INC_MPI /opt/apps/pgi7_2/mvapich/1.0.1/include
setenv LIB_MPI /opt/apps/pgi7_2/mvapich/1.0.1/lib
#unsetenv INC_MPI
#unsetenv LIB_MPI
mkdir buildpar
cd buildpar
../cam1/models/atm/cam/bld/configure -spmd


Builds but fails to link:


pgf90 -o /share/home/00671/tobis/CAM3Bld/buildpar/cam BalanceCheckMod.o BareGroundFluxesMod.o Biogeophysics1Mod.o Biogeophysics2Mod.o BiogeophysicsLakeMod.o CanopyFluxesMod.o DGVMAllocationMod.o DGVMEcosystemDynMod.o DGVMEstablishmentMod.o DGVMFireMod.o DGVMKillMod.o DGVMLightMod.o DGVMMod.o DGVMMortalityMod.o DGVMReproductionMod.o DGVMRestMod.o DGVMTurnoverMod.o DriverInitMod.o FracWetMod.o FrictionVelocityMod.o Hydrology1Mod.o Hydrology2Mod.o HydrologyLakeMod.o QSatMod.o RtmMod.o RunoffMod.o STATICEcosysDynMod.o SnowHydrologyMod.o SoilHydrologyMod.o SoilTemperatureMod.o SurfaceAlbedoMod.o SurfaceRadiationMod.o TridiagonalMod.o VOCEmissionMod.o abortutils.o acbnd.o accFldsMod.o accumulMod.o advnce.o aer_optics.o aerosol_intr.o albice.o albocean.o areaMod.o atm_lndMod.o atmdrvMod.o bandij.o basdy.o basdz.o basiy.o bilin.o binary_io.o bnddyi.o bndexch.o buffer.o caer.o caerbnd.o cam.o camice.o camoce.o carbon_intr.o carbonscales.o ccsm_msg.o check_energy.o chem_surfvals.o chemistry.o cldconst.o cldinti.o cldsav.o cldwat.o clm_csmMod.o clm_varcon.o clm_varctl.o clm_varpar.o clm_varsur.o clmtype.o clmtypeInitMod.o cloud_fraction.o cloudsimulator.o cmparray_mod.o comhd.o commap.o comspe.o comsrf.o comsrfdiag.o constituents.o controlMod.o convect_deep.o convect_shallow.o courlim.o cpslec.o cubxdr.o cubydr.o cubzdr.o dadadj.o datetime.o decompMod.o decompinit.o diag_dynvar_ic.o diagnostics.o difcor.o diffusion_solver.o dmsbnd.o do_close_dispose.o do_restwrite.o dp_coupling.o driver.o drydep_mod.o dust.o dust_intr.o dust_sediment_mod.o dycore.o dyn.o dyn_grid.o dynconst.o dyndrv.o dynpkg.o engy_tdif.o engy_te.o error_messages.o esinti.o extx.o extys.o extyv.o f_wrappers.o fft99.o filenames.o fileutils.o filterMod.o flxint.o flxoce.o gauaw_mod.o geopotential.o get_memusage.o getdatetime.o gffgch.o ghg_defaults.o gptl.o gptl_papi.o gptlutil.o grcalc.o grdxy.o grmult.o gw_drag.o hb_diff.o hdinti.o herxin.o heryin.o herzin.o histFileMod.o histFldsMod.o history.o hk_conv.o 
hordif.o hordif1.o hrintp.o hycoef.o icarus_scops.o ice_constants.o ice_data.o ice_dh.o ice_diagnostics.o ice_globalcalcs.o ice_kinds_mod.o ice_ocn_flux.o ice_sfc_flux.o ice_srf.o ice_tstm.o infnan.o iniTimeConst.o iniTimeVar.o inicFileMod.o inidat.o initGridCellsMod.o inital.o initcom.o initext.o initializeMod.o initindx.o inti.o intp_util.o ioFileMod.o iobinary.o iop.o kdpfnd.o lagyin.o lcbas.o lcdbas.o limdx.o limdy.o limdz.o linebuf_stdout.o linemsdyn.o lininterp.o lnd2atmMod.o lp_coupling.o marsaglia.o massfix.o mkglacier.o mkgridMod.o mklai.o mklanwat.o mkpft.o mkrank.o mksoicol.o mksoitex.o mksrfdatMod.o mkurban.o molec_diff.o mpiinc.o mpishorthand.o nanMod.o ncdio.o ncdio_atm.o omcalc.o ozone_data.o param_cldoptics.o pdelb0.o pft2colMod.o pftvarcon.o phcs.o phys_adiabatic.o phys_buffer.o phys_gmean.o phys_grid.o phys_idealized.o physconst.o physics_types.o physpkg.o pkg_cld_sediment.o pkg_cldoptics.o plevs0.o pmgrid.o ppgrid.o prescribed_aerosols.o print_coverage.o print_memusage.o prognostics.o program_csm.o program_off.o pspect.o qmassa.o qmassd.o qneg3.o qneg4.o quad.o quicksort.o rad_constituents.o radae.o radheat.o radiation.o radlw.o radsw.o ramp_scon.o readinitial.o realloc4.o realloc7.o reordp.o restFileMod.o restart.o restart_dynamics.o restart_physics.o rgrid.o rstwr.o rtcrate.o runtime_opts.o scan2.o scandyn.o scanslt.o scm0.o scyc.o seasalt_intr.o settau.o sgexx.o shr_alarm_mod.o shr_cal_mod.o shr_const_mod.o shr_date_mod.o shr_file_mod.o shr_kind_mod.o shr_mpi_mod.o shr_msg_mod.o shr_orb_mod.o shr_sys_mod.o shr_timer_mod.o shr_vmath_fwrap.o shr_vmath_mod.o snowdp2lev.o soxbnd.o spegrd.o spetru.o sphdep.o spmdGathScatMod.o spmdMod.o spmd_dyn.o spmd_phys.o spmd_utils.o spmdinit.o srchutil.o srfoce.o srfxfer.o sst_data.o stats.o stepon.o stratiform.o string_utils.o subgridAveMod.o sulbnd.o sulchem.o sulemis.o sulfur_intr.o surfFileMod.o swap_comm.o system_messages.o tfilt_massfix.o threadutil.o time_manager.o timeinterp.o tphysac.o tphysbc.o 
tphysidl.o tracers.o tracers_suite.o trb_mtn_stress.o trjmps.o trunc.o tsinti.o tstep.o units.o upper_bc.o vertical_diffusion.o vertinterp.o virtem.o volcanicmass.o volcemission.o volcrad.o vrtmap.o wetdep.o wrap_mpi.o wrap_nf.o wv_saturation.o xqmass.o zenith.o zm_conv.o -L/opt/apps/pgi7_2/netcdf/3.6.2/lib/ -lnetcdf -L/share/home/00671/tobis/CAM3Bld/buildpar/esmf/lib/libO/linux_pgi -lesmf -L/opt/apps/pgi7_2/mvapich/1.0.1/lib -lmpich

/opt/apps/pgi7_2/mvapich/1.0.1/lib/libmpich.a(dreg.o): In function `flush_dereg_mrs_external':
dreg.c:(.text+0x3b8): undefined reference to `ibv_dereg_mr'
/opt/apps/pgi7_2/mvapich/1.0.1/lib/libmpich.a(dreg.o): In function `dreg_new_entry':
dreg.c:(.text+0xa23): undefined reference to `ibv_reg_mr'
dreg.c:(.text+0xa4a): undefined reference to `ibv_reg_mr'
...

and many more, all undefined references coming from inside the mvapich library


Now, is the linker actually looking in LIB_MPI ?

Have to cut and paste that mess into a file and grep for it. Sure is. Right at the end there.
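Rather than eyeballing it, the link line can be split on spaces and filtered for library flags. A sketch in POSIX sh rather than csh, with link_flags as a made-up helper and the object list and output path cut down for the example.

```shell
# Hypothetical helper: print the -L and -l options from a link command on stdin.
link_flags() {
    tr -s ' ' '\n' | grep -E '^-[Ll]'
}

# Sample input: the link line above with the object list reduced to one file.
echo "pgf90 -o cam zm_conv.o -L/opt/apps/pgi7_2/netcdf/3.6.2/lib/ -lnetcdf \
-L/share/home/00671/tobis/CAM3Bld/buildpar/esmf/lib/libO/linux_pgi -lesmf \
-L/opt/apps/pgi7_2/mvapich/1.0.1/lib -lmpich" | link_flags
```

The MPI directory and -lmpich are both in the list, so the linker is looking in LIB_MPI. What is not in the list is anything that would supply the ibv_ symbols. Those belong to the InfiniBand verbs library (normally linked as -libverbs), which the mpif90 wrapper would presumably have added and plain pgf90 does not. That last part is a plausible guess, not something confirmed on this system.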
