In recent days, thanks to the summer break taken by many customers, I had the opportunity to test some new tools and products that I had always wanted to try but never had the chance to. One of the tools on my list was Migrate While Active, option 2 of the DB2 Mirror product (5770DBM).
There are differences between option 1 (the real DB2 Mirror) and option 2 (Migrate While Active, or MWA), both from a technological point of view (option 1 requires RDMA technology, while option 2 only requires TCP connectivity between the two systems) and in terms of the product's purpose (option 1 provides high availability across two active nodes; option 2 is not a high-availability tool, it is used to migrate systems).
Now, the curiosity about this tool stems from the fact that one of the tasks we perform daily within my organisation is the design and completion of migrations from customer data centres to our data centre, and as everyone can imagine, the less Mr Business stops, the happier customers are, especially in a complex phase such as the migration of a production system.
With the aim of reducing machine downtime, I jumped at the new methodology released with level 27 of the DB2 Mirror group for 7.4, which effectively uses data replication technologies managed directly by the operating system without going through a shutdown for global (or partial) machine backup. So let’s have a look at the requirements:
OS V7R4M0 (at the moment; support for newer releases is planned for the autumn)
Group SF99668 level 27
Empty target partition with same LIC level as production
A fair amount of bandwidth to use for replication
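Before starting, you can quickly check the installed group level with the QSYS2.GROUP_PTF_INFO catalog (a read-only query, shown here as a convenience): SELECT PTF_GROUP_NAME, PTF_GROUP_LEVEL, PTF_GROUP_STATUS FROM QSYS2.GROUP_PTF_INFO WHERE PTF_GROUP_NAME = 'SF99668'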
Let’s get started! For my test, I cloned an internal production machine with approximately 1.5TB of data, 1 Power9 core, and 32GB of RAM. My goal was to replicate this machine from the primary datacentre to the datacentre we use for disaster recovery. The two datacentres are directly connected from a network perspective, so I had plenty of bandwidth available. Once the clone was created and updated with PTFs, I was ready to start replicating. I also had to install another IBM i system with the *BASE option of product 5770DBM so that I could use the GUI.
Now, on the DR site, I created a partition of similar size and installed the same LIC level as the production system (this is why I updated the PTFs on the initial clone: to use the latest LIC resave). Once the LIC installation is complete, we are ready to start the replication.
You need to access the web GUI (https://guiserver:2010/Db2Mirror) and click on “CONFIGURE NEW PAIR”; then we’ll choose Partition Mirroring:
Once Partition Mirroring is selected, we need to enter information about the source node; these credentials are used to check whether all requirements are fulfilled:
In my case, the LIC had already been installed on the target node, so you can proceed to configure the Service LAN Adapter from DST (here the full guide). When the adapter configuration is finished, we can continue in the wizard by entering the target IP address:
As you can understand, on the target node there is no OS and no TCP services; this is low-level communication. In fact, in the next step you need to generate a key from the GUI that allows source and target to communicate; this key must be entered in a DST macro (here the full documentation).
Once these actions have been performed in DST, we are able to start the replication:
As you can imagine, at this point the sysbas synchronization phase begins, during which all data must be transferred and applied to the target replication system. As mentioned a few moments ago, this technology is only valid for sysbas; for iASP, the features already incorporated into PowerHA remain. The system replication is total, so it is not possible to partially replicate by indicating what to replicate and what not to replicate.
One of the real gems of this tool, however, is the ability to preview the consistency of the system. Once the systems are synchronized, it is possible to activate a test mode in which replication is suspended and both systems track data modification activities: the source to know what to send to the target site, and the target to know how to roll back the changes once testing is complete. How the test mode is used will depend on the requirements of the various end users: it can be activated with a different network setup or with the same IP as production (clearly, in the latter case the production interface will have to be brought down to avoid duplicate IPs). Once you are ready to perform the CUTOVER, disk writes on production are suspended, pending transactions are applied on the target, and the target partition is then restarted with the specified network settings. In every phase you can see how much time is required.
Unfortunately, this test did not provide me with any significant benchmarks. The machine was still a clone, and although the batches ran continuously, there were no interactive workloads or interfaces with other external systems, so the workload was not comparable to production. However, it was possible to migrate a system of approximately 1.5TB in about 6 hours. During those 6 hours of maximum-priority replication (it is possible to specify priorities so as not to impact performance too much), the system was using 1 full core (Power9) and about 200 Mbps. In 6 hours, I would have been able to migrate my system to another datacentre, and the great thing is that all this happens without the initial downtime to take the backup… in fact, the only downtime is the migration cutover itself. Clearly, this type of solution requires good computing power and bandwidth, which initially allows the complete system to be brought online and then keeps the two environments aligned.
And you, have you had the opportunity to test this kind of software?
This week, I received a request from a customer to create a batch procedure that would restore files from the production environment to the test environment without transferring members, due to disk space limitations. It seemed like a simple task, as the operating system allows this without any problems using the native RSTOBJ command, thanks to the FILEMBR((*ALL *NONE)) parameter:
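For example (MYFILE, PRODLIB, TESTLIB and TAP01 are hypothetical names), a native restore without members looks something like this: RSTOBJ OBJ(MYFILE) SAVLIB(PRODLIB) DEV(TAP01) OBJTYPE(*FILE) FILEMBR((*ALL *NONE)) RSTLIB(TESTLIB)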
As mentioned above, it seemed simple, but the BRMS commands do not support this functionality. Take the RSTOBJBRM command, for example… If I try to pass it the FILEMBR keyword, I get an error, precisely because the parameter does not exist on this command:
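The error is the classic keyword message, something like (shown here purely as an illustration): CPD0043 – Keyword FILEMBR not valid for this command or program.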
When talking to IBM support, I was told that the only solution at the moment to achieve my goal of restoring without members was to concatenate the native command with the information contained within the BRMS DB. This gave me the idea of creating a simple SQL procedure that would allow me to achieve my goal. Clearly, it was also feasible in other languages; I could have achieved the same result with an RPG program. The choice of SQL was dictated by the need to find a quick alternative that did not require a great deal of development effort.
Let’s start with what we need… Let’s assume that the list of objects to be restored and the library in which they are located are passed as parameters, and that the library into which the restore is to be performed is also passed as a parameter. Now, what we need to determine are the tape on which the objects are saved, the sequence number (although we could use the *SEARCH value) and the device to be used for the restore.
Now, if you are using product 5770BR2 (with the most recent PTFs), extracting this information is quite simple. In fact, there is a view in QUSRBRM called BACKUP_HISTORY_OBJECT that returns information about the various saved objects. Alternatively, if you are using 5770BR1, you will need to query the QA1AOD file.
For example, let’s find the media information for an object in QGPL (I will use QAPZCOVER as an example)…
With 5770BR2 the SQL statement is: select VOLUME_SERIAL, DEVICE_NAMES, FILE_SEQUENCE_NUMBER from qusrbrm.backup_history_object WHERE SAVED_ITEM = 'QGPL' AND saved_object = 'QAPZCOVER' ORDER BY SAVE_TIMESTAMP DESC
If you are using 5770BR1: SELECT OBVOL AS VOLUME_SERIAL, OBHDEV AS DEVICE_NAMES, OBHSEQ AS FILE_SEQUENCE_NUMBER FROM QUSRBRM.QA1AOD where OBHLIB='QGPL' AND OBHNAM='QAPZCOVER' ORDER BY OBHDAT DESC
As you can see, the result is the same regardless of which of the two queries you use.
Now, in my case, I needed the most recent save, so I applied a LIMIT 1 in my function, sorting in descending order by save date (so I inevitably get the most recent save). If you also want to parameterise the date, you simply need to add a parameter to the procedure and add a condition to the WHERE clause.
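A minimal sketch of that change (BEFORE_TS is a hypothetical parameter name; this uses the 5770BR2 view, where the timestamp comparison is straightforward):
-- extra procedure parameter: IN BEFORE_TS TIMESTAMP
SELECT VOLUME_SERIAL, DEVICE_NAMES, FILE_SEQUENCE_NUMBER
FROM QUSRBRM.BACKUP_HISTORY_OBJECT
WHERE SAVED_ITEM = 'QGPL' AND SAVED_OBJECT = 'QAPZCOVER'
AND SAVE_TIMESTAMP <= BEFORE_TS
ORDER BY SAVE_TIMESTAMP DESC
LIMIT 1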
Now we are ready to create our procedure: first we build the RSTOBJ command by retrieving data from QUSRBRM, then we use SYSTOOLS.LPRINTF to write the executed command to the joblog, and finally we run the command using the QSYS2.QCMDEXC procedure. In my case, the RSTLIB parameter is optional; by default it is *SAVLIB:
SET PATH "QSYS","QSYS2","SYSPROC","SYSIBMADM" ;
CREATE OR REPLACE PROCEDURE SQLTOOLS.RSTNOMBR2 (
IN OBJLIST VARCHAR(1000) ,
IN LIB VARCHAR(10) ,
IN RSTLIB VARCHAR(10) DEFAULT '*SAVLIB' )
LANGUAGE SQL
SPECIFIC SQLTOOLS.RSTNOMBR2
NOT DETERMINISTIC
MODIFIES SQL DATA
CALLED ON NULL INPUT
SET OPTION ALWBLK = *ALLREAD ,
ALWCPYDTA = *OPTIMIZE ,
COMMIT = *NONE ,
DECRESULT = (31, 31, 00) ,
DYNDFTCOL = *NO ,
DYNUSRPRF = *USER ,
SRTSEQ = *HEX
BEGIN
DECLARE CMD VARCHAR(10000);
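-- Build the native RSTOBJ command using media info from the most recent BRMS save of the library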
SELECT
'RSTOBJ OBJ(' CONCAT TRIM(OBJLIST) CONCAT ') SAVLIB(' CONCAT TRIM(LIB) CONCAT ') DEV(' CONCAT TRIM(OBHDEV) CONCAT ') SEQNBR('
CONCAT OBHSEQ CONCAT ') VOL(' CONCAT TRIM(OBVOL) CONCAT ') ENDOPT(*UNLOAD) OBJTYPE(*ALL) OPTION(*ALL) MBROPT(*ALL) ALWOBJDIF(*COMPATIBLE) RSTLIB('
CONCAT TRIM(RSTLIB) CONCAT ') DFRID(Q1ARSTID) FILEMBR((*ALL *NONE))'
INTO CMD
FROM QUSRBRM.QA1AOD WHERE OBHLIB = TRIM(LIB) ORDER BY OBHDAT DESC LIMIT 1;
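-- Write the generated command to the joblog, then execute it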
CALL SYSTOOLS.LPRINTF(TRIM(CMD));
CALL QSYS2.QCMDEXC(TRIM(CMD));
END;
OK, now that we have created this procedure, we are ready to test it… From a 5250 screen you can use this command: RUNSQL SQL('call sqltools.rstnombr2(''QAPZCOVER'', ''QGPL'', ''MYLIB'')') COMMIT(*NONE)
This is the result:
If you put this command in a CL program, you can perform this activity in a batch job. In this way you can also restore from other systems in the same BRMS network if they share media information; in that case, you should query the QA1AHS file instead, because object-level detail is not shared.
Syslog, short for System Logging Protocol, is one of the cornerstones of modern IT infrastructures. Born in the early days of Unix systems, it has evolved into a standardized mechanism that enables devices and applications to send event and diagnostic messages to a central logging server. Its simplicity, flexibility, and widespread support make it indispensable across networks of any scale.
At its core, Syslog functions as a communication bridge between systems and administrators. It allows servers (including IBM i partitions), routers, switches, and even software applications to report what’s happening inside them—be it routine processes, configuration changes, warning alerts, or system failures. These messages can also be transmitted in real time to centralized collectors, allowing professionals to stay informed about what’s occurring in their environments without needing to inspect each machine individually.
This centralized approach is critical in environments that demand security and reliability. From banks to hospitals to government networks, organizations rely on Syslog not just for operational awareness but also for auditing and compliance. Log files generated by Syslog can help trace user activities and identify suspicious behavior or cyberattacks. That makes it an essential component in both reactive troubleshooting and proactive monitoring strategies.
So, on IBM i there are at least three places where you can generate Syslog output.
The first place where you can extract syslog format is the system log. The QSYS2.HISTORY_LOG_INFO function allows you to extract output in this format. In my example, I want to highlight five restore operations performed today: SELECT syslog_facility, syslog_severity, syslog_event FROM TABLE (QSYS2.HISTORY_LOG_INFO(START_TIME => CURRENT DATE, GENERATE_SYSLOG =>'RFC3164' ) ) AS X where message_id='CPC3703' fetch first 5 rows only;
By changing the condition in the WHERE clause, it is possible to work on other message IDs that may be more significant; for example, it is possible to log the specific message ID for abnormal job terminations (since auditors enjoy asking for extractions on failed batches), as in the sketch below.
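This is only a sketch: CPF1164 is the job-completion message written to the history log, and a severity of 30 or higher is a common indicator of an abnormal end (verify the threshold in your environment): SELECT syslog_facility, syslog_severity, syslog_event FROM TABLE (QSYS2.HISTORY_LOG_INFO(START_TIME => CURRENT DATE, GENERATE_SYSLOG => 'RFC3164')) AS X WHERE message_id = 'CPF1164' AND severity >= 30;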
The second tool that could be very useful is the analysis of journals with syslog, in fact the QSYS2.DISPLAY_JOURNAL function also allows you to generate output in syslog format. In my example, I extracted all audit journal entries (QSYS/QAUDJRN) that indicated the deletion operation of an object on the system (DO entry type): SELECT syslog_facility, syslog_severity, syslog_event FROM TABLE (QSYS2.DISPLAY_JOURNAL('QSYS', 'QAUDJRN',GENERATE_SYSLOG =>'RFC5424') ) AS X WHERE syslog_event IS NOT NULL and JOURNAL_ENTRY_TYPE='DO';
Of course, it is possible to extract entries for any type of journal, including application journals.
The last place that comes to mind is the system’s syslog service log file. In a previous article, we saw how this service could be used to log SSH activity. In my case the log file is located under /var/log/messages, so with the QSYS2.IFS_READ function I can read it easily: SELECT * FROM TABLE(QSYS2.IFS_READ(PATH_NAME => '/var/log/messages'));
These are just starting points… as mentioned previously, these entries are very important for monitoring events that occur on systems. Having them logged and stored in a single repository together with those of other platforms can make a difference in managing attacks or system incidents in general.
Do you use these features to monitor and manage events on your systems?
One of the tools that I still see underutilized on IBM i systems is function usage. Essentially, it is a tool that can centrally manage permissions and authorizations for the use of certain operating system functions. It is a very powerful tool in itself and must be handled with care. One of the use cases I have seen, for example, is the ability to inhibit access to remote database connections without writing a single line of code by working on user profiles or group profiles.
To view the current status, you can use Navigator or invoke the WRKFCNUSG command. Alternatively, there is a system view that shows the same configuration, which you can easily query with: SELECT * FROM QSYS2.FUNCTION_USAGE.
In this window you can now see the current configuration of your system:
Now, simply by accessing Navigator, you have already come across a function usage. In fact, there is QIBM_NAV_ALL_FUNCTION, which establishes the access policies of the various users to Navigator functions. By default, this function usage prevents all users from using Navigator, while users with *ALLOBJ authority can use it.
This is because function usage has different levels of authorization: the default authorization that applies to all users, authorization for users with *ALLOBJ, and finally explicit authorizations that can be applied to individual profiles or individual group profiles.
When we talk about function usage, my advice is to choose the approach you want to follow and start applying it to the various components that may be affected by these changes. Let me explain: generally speaking, when it comes to security, there are two approaches: the first allows everything to everyone except those who have expressly denied authorization, while the second denies everything to everyone except certain explicitly authorized users. Obviously, I personally prefer the second approach, but it requires a more in-depth analysis and risk assessment.
Speaking of function usage, in addition to managing permissions on Navigator, it is also possible to manage permissions on BRMS (if installed) and on some TCP/IP servers, which we will now look at.
For example, let’s assume we want to block database connections (QZDASOINIT or DDM/DRDA connections). The strategy is to block access for all users, without distinguishing between *ALLOBJ and non-*ALLOBJ users, and then authorize specific individual users. In this case you need to edit QIBM_DB_ZDA and QIBM_DB_DDMDRDA.
So, as I said above, we have set DENIED as the default and Not Used (which behaves like DENIED) for *ALLOBJ users. Below is the list of users that are authorized.
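The same per-user authorizations can also be granted from a 5250 session with CHGFCNUSG (ALICE is a hypothetical profile name; use USAGE(*DENIED) to revoke):
CHGFCNUSG FCNID(QIBM_DB_ZDA) USER(ALICE) USAGE(*ALLOWED)
CHGFCNUSG FCNID(QIBM_DB_DDMDRDA) USER(ALICE) USAGE(*ALLOWED)
You can then verify the result with: SELECT * FROM QSYS2.FUNCTION_USAGE WHERE FUNCTION_ID IN ('QIBM_DB_ZDA', 'QIBM_DB_DDMDRDA')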
On IBM i systems, the SSH service is playing an important role in modernisation: it can be used, for instance, to take advantage of new software development tools such as VS Code For I, or in an innovative software release context using pipelines. SSH (or rather SFTP) is also playing a key role in securing data exchange flows, gradually replacing the plain-text transfers of the FTP protocol, long popular in the IBM i context.
At the moment the SSHD server doesn’t have any kind of exit point that we can use to restrict or manage connections to this server… This doesn’t mean that it is not possible to make this server secure! In this article we will show how to restrict access to specific users (or groups of users) and log the access attempts that are made.
What do we need to know? Well, the SSHD server behaves the same way it does on other platforms and therefore allows you to use the same directives, so if you are familiar with other UNIX-like platforms, you won’t have any kind of problem here. As far as logging is concerned, we will again use a very convenient and widely used utility on UNIX systems, namely syslogd.
How to configure and activate SysLogD?
This service is automatically installed with the 5733SC1 product. Activating the daemon is quite simple: you only need to submit a job that starts it, as per this command: SBMJOB CMD(STRQSH CMD('/QOpenSys/usr/sbin/syslogd')) JOB(SYSLOGD) JOBQ(QSYSNOMAX) (P.S. you need to put this command into your QSTRUP program)
To check that everything is OK, look at NETSTAT option 3, where you should find UDP port 514 in listening status.
So, now that the daemon is active, you only need to change your SSHD configuration file so that all entries are sent to the syslog server, as shown in the sketch below:
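A minimal sketch of the relevant sshd_config directives (the location of sshd_config varies with the OpenSSH level shipped on your system, so check your installation):
# send authentication events to syslogd
SyslogFacility AUTH
# INFO logs successful and failed logins; VERBOSE would add key fingerprints
LogLevel INFO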
Restart the SSHD server (ENDTCPSVR SERVER(*SSHD) followed by STRTCPSVR SERVER(*SSHD)) and check the /var/log/messages or /var/log/auth files.
How to restrict access to ssh?
The logic behind configuring user restriction in SSH can work in either direction: defining a list of users who are not authorised to connect (so all the others are), or defining the list of users who are authorised (so all the others are not). In my case, the choice falls on the second option, authorising access to this service only to restricted groups of users, as in the sketch below.
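A minimal sshd_config sketch (SSHADMIN and SSHUSERS are hypothetical group profile names; with AllowGroups, any user whose group is not listed is rejected at login):
# only members of these group profiles may connect
AllowGroups sshadmin sshusers
# alternatively, authorise individual users
# AllowUsers alice bob
Remember to restart the SSHD server after changing the file.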
More and more frequently, customers are reporting that after upgrading to Windows 11 24H2 they have problems connecting to network shares via NetServer.
A quick aside: in itself, the fact that the IFS of an IBM i system is accessible is a great thing, but you have to be extremely careful about what you share and with what permissions. I frequently see systems where root ( / ) is shared read/write; this is very dangerous because, in addition to the IFS, it lets you browse the other file systems such as QSYS.LIB and QDLS. So try, if possible, to share as little as possible, with the lowest permissions possible. End of aside.
Returning to the initial issue, it does seem that Microsoft’s update (released a few months ago by now) has introduced issues related to the support of certain characters in IFS file names. Thus, if a folder contains a file whose name includes one of the offending special characters, Windows loses access to that folder. The characters that generate these problems are the following:
< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
Here, as indicated in this IBM documentation link, renaming the files to remove the offending characters will restore access to the shared folders. Now, in a production context it is easy to imagine that there are several shared folders and that the IFS is an infinitely large place with infinite files (most of the time abandoned :-D), so we need a clever way to check which shared folders might have problems. To do this we will rely on two SQL services: the first to list the folders we are sharing, the second to list the paths that contain special characters.
Thanks to the QSYS2.SERVER_SHARE_INFO view, we will have the ability to list the paths that have been shared via netserver with the following query:
select PATH_NAME from QSYS2.SERVER_SHARE_INFO where PATH_NAME is not null
Now that we have the list of all shared directories, we just need to analyze their contents. To do this we will use the QSYS2.IFS_OBJECT_STATISTICS table function, which takes as parameters the name of the starting path, any paths to be excluded, and an indication of whether to scan subdirectories (in our case, clearly, we want it to scan those as well). Now, we are not interested in all the files, only those whose names contain the special characters not supported by Windows, so we will apply a WHERE clause. Here is an example of the query on a small path (be aware that on large directory trees this query can run for a long time):
SELECT PATH_NAME, CREATE_TIMESTAMP, ACCESS_TIMESTAMP, OBJECT_OWNER
FROM TABLE (
QSYS2.IFS_OBJECT_STATISTICS(
START_PATH_NAME => '/qibm/ProdData/Access/ACS/Base',
OMIT_LIST => '/QSYS.LIB /QNTC /QFILESVR.400',
SUBTREE_DIRECTORIES => 'YES')
) AS X
WHERE PATH_NAME LIKE '%\%'
OR PATH_NAME LIKE '%<%'
OR PATH_NAME LIKE '%>%'
OR PATH_NAME LIKE '%|%'
OR PATH_NAME LIKE '%*%'
OR PATH_NAME LIKE '%:%'
OR PATH_NAME LIKE '%?%'
OR PATH_NAME LIKE '%"%'
(The double quote is included to match the character list above; the forward slash is omitted because it is the path separator and cannot appear in a file name.)
In my example I took a fairly small path (the one with the ACS installer) and it ran quickly. Moreover, no file name contains any of the offending characters, so I can rest easy: in fact, the query did not return any rows.
At this point, there is nothing left to do but combine the two queries into a very simple RPG program. Now, considering that the second scan query can take a long time, it is a good idea to submit its execution, saving the results into another table.
As you can see, my program is pretty short, simply combining the two queries, and in this way you are able to find every file that will break your shares. At the end of the execution, check the MYSTUFF/SHARECHAR file, where you will find details about each file, such as path name, owner, and creation and last-access timestamps.
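If you do not want to write an RPG wrapper, a pure SQL sketch of the same combination could look like this (MYSTUFF.SHARECHAR is the example result table from above; the join feeds each shared path into IFS_OBJECT_STATISTICS, so expect a long runtime on large trees):
-- Sketch: scan every shared path and save the offending names into MYSTUFF.SHARECHAR
CREATE TABLE MYSTUFF.SHARECHAR AS (
SELECT X.PATH_NAME, X.CREATE_TIMESTAMP, X.ACCESS_TIMESTAMP, X.OBJECT_OWNER
FROM (SELECT PATH_NAME FROM QSYS2.SERVER_SHARE_INFO WHERE PATH_NAME IS NOT NULL) S,
TABLE (QSYS2.IFS_OBJECT_STATISTICS(
START_PATH_NAME => S.PATH_NAME,
OMIT_LIST => '/QSYS.LIB /QNTC /QFILESVR.400',
SUBTREE_DIRECTORIES => 'YES')) AS X
WHERE X.PATH_NAME LIKE '%\%' OR X.PATH_NAME LIKE '%<%' OR X.PATH_NAME LIKE '%>%'
OR X.PATH_NAME LIKE '%|%' OR X.PATH_NAME LIKE '%*%' OR X.PATH_NAME LIKE '%:%'
OR X.PATH_NAME LIKE '%?%' OR X.PATH_NAME LIKE '%"%'
) WITH DATA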
Remember, this is SQL, so you can also change whatever you want, such as the columns, the destination file, and so on.
I hope I have given you a way to save time on what can be a rather insidious and annoying problem.