Thursday, 29 January 2015

Unknown authentication algorithm : [SHA]

Still working on my Oracle Enterprise Manager 12c Plugin and hacking because of the lack of adequate documentation.  After successfully deploying my plugin I made the fatal mistake of exercising the functionality.  My plugin uses SNMP to collect its metrics and, as we all know, SNMPv1 is not very secure (its community strings travel in clear text); in fact it is fast being outlawed in many data centers.

Being security conscious, as every hacker should be, I thought I'd configure my target to use SNMPv3. That was my second mistake with the plugin development.  All of a sudden my agent log files were engulfed with those ever so helpful Java exceptions.  Okay, so they are always engulfed with Java exceptions, but these are the ones you're interested in today.
java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Unknown authentication algorithm : [SHA]
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
at java.util.concurrent.FutureTask.get(FutureTask.java:91)
at oracle.sysman.gcagent.task.TaskFutureImpl.get(TaskFutureImpl.java:320)
at oracle.sysman.gcagent.metadata.impl.collection.CollectionItem$CollectionItemTask.getSubTaskFuture(CollectionItem.java:2850)
at oracle.sysman.gcagent.metadata.impl.collection.CollectionItem$CollectionItemTask.uploadAtomic(CollectionItem.java:3040)
at oracle.sysman.gcagent.metadata.impl.collection.CollectionItem$CollectionItemTask.run(CollectionItem.java:2549)
at oracle.sysman.gcagent.task.AbstractTemplateTask.call(AbstractTemplateTask.java:198)
at oracle.sysman.gcagent.task.AbstractTemplateTask.call(AbstractTemplateTask.java:49)
at oracle.sysman.gcagent.task.scheduler.DispatchingTaskScheduler$ReschedulingHelper$ReschedulingTask.call(DispatchingTaskScheduler.java:439)
at oracle.sysman.gcagent.task.scheduler.DispatchingTaskScheduler$ReschedulingHelper$ReschedulingTask.call(DispatchingTaskScheduler.java:401)
at oracle.sysman.gcagent.task.executor.DiagWrappedTask.call(DiagWrappedTask.java:60)
at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.accountedCall(TaskFutureImpl.java:599)
at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.call(TaskFutureImpl.java:643)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at oracle.sysman.gcagent.task.TaskFutureImpl.run1(TaskFutureImpl.java:380)
at oracle.sysman.gcagent.task.TaskFutureImpl.run(TaskFutureImpl.java:337)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at oracle.sysman.gcagent.task.executor.TrackThreadFactory$1.run(TrackThreadFactory.java:54)
at oracle.sysman.gcagent.util.system.GCAThread$RunnableWrapper.run(GCAThread.java:189)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.IllegalArgumentException: Unknown authentication algorithm : [SHA]
at com.sun.management.snmp.usm.SnmpUsmPasswordLcd.translateAndInsert(SnmpUsmPasswordLcd.java:742)
at com.sun.management.snmp.usm.SnmpUsmPasswordLcd.addUser(SnmpUsmPasswordLcd.java:371)
at oracle.sysman.gcagent.addon.fetchlet.snmp.SNMPFetchlet.discoverTarget(SNMPFetchlet.java:1455)
at oracle.sysman.gcagent.addon.fetchlet.snmp.SNMPFetchlet.processV3Request(SNMPFetchlet.java:1318)
at oracle.sysman.gcagent.addon.fetchlet.snmp.SNMPFetchlet.processRequests(SNMPFetchlet.java:1070)
at oracle.sysman.gcagent.addon.fetchlet.snmp.SNMPFetchlet.getMetric(SNMPFetchlet.java:955)
at oracle.sysman.gcagent.target.interaction.execution.FetchletFactory.getMetric(FetchletFactory.java:427)
at oracle.sysman.gcagent.target.interaction.execution.ExecuteTask.executeQueryDescriptor(ExecuteTask.java:1050)
at oracle.sysman.gcagent.target.interaction.execution.ExecuteTask.runTask(ExecuteTask.java:3905)
at oracle.sysman.gcagent.target.interaction.execution.ExecuteTask.call(ExecuteTask.java:5098)
at oracle.sysman.gcagent.metadata.impl.collection.MetricColl$1.call(MetricColl.java:552)
at oracle.sysman.gcagent.metadata.impl.collection.MetricColl$1.call(MetricColl.java:513)
at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.accountedCall(TaskFutureImpl.java:599)
at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.call(TaskFutureImpl.java:643)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at oracle.sysman.gcagent.task.TaskFutureImpl.run1(TaskFutureImpl.java:380)
at oracle.sysman.gcagent.task.TaskFutureImpl.run(TaskFutureImpl.java:337)
at oracle.sysman.gcagent.task.CompositeTask.runSubtask(CompositeTask.java:70)
at oracle.sysman.gcagent.task.CompositeTask.run(CompositeTask.java:77)
at oracle.sysman.gcagent.metadata.impl.collection.CollectionItem$CollectionItemTask.run(CollectionItem.java:2536)
... 16 more
So the background to this is that when a metric definition is defined with "Snmp" (and yes, that is a deliberately incorrect capitalisation of the acronym SNMP; see my other post "Could not create instance of fetchlet : SNMP"), you should also define the target with monitoring credentials, which I did.  I have allowed users to choose both SNMPv1Creds and SNMPv3Creds.  The SNMPv3Creds allows you to choose an algorithm of {blank}, [SHA] or [MD5] for the authPwd and privPwd parameters. It doesn't matter what you choose, you will still see the exception above.  In the case of [MD5] it will start:
java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Unknown authentication algorithm : [MD5]
I eventually managed to track this down to bug 19081194 - SNMP CREDENTIALS UI AND FETCHLET/RECEIVELET DISAGREE ABOUT PROTOCOL NAMES. Unfortunately, at the time of writing I'm still waiting for a fix.
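If you want to confirm that your own agent is hitting the same problem, a quick tally of the rejected algorithm names in the agent log can help. The following is only a rough sketch: it assumes the message text appears exactly as in the traces above, and the function name count_unknown_algorithms is my own invention, not part of any Oracle tooling.

```python
import re
from collections import Counter

# Matches the message text from the stack traces above, e.g.
# "Unknown authentication algorithm : [SHA]"
PATTERN = re.compile(r"Unknown authentication algorithm : (\S+)")


def count_unknown_algorithms(lines):
    """Tally rejected algorithm names, one count per matching log line.

    A single exception usually produces two matching lines (the top-level
    message and its "Caused by:" line), so halve the totals if you want
    a count of exceptions rather than lines.
    """
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Pass it any iterable of lines, for example an open file handle on whichever agent log you are reading, and print the resulting Counter. If every name in the tally is wrapped in square brackets, you are most likely looking at the same bug.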

Could not create instance of fetchlet : SNMP

Hopefully this post and the others like it will save others the numerous days I spent hacking around in the dark trying to find solutions while developing my Oracle Enterprise Manager 12c Plugin.

My plugin collects its metric data using SNMP. No prizes for guessing, but the first time I deployed my plugin it didn't work. When you define your metrics you get to choose a fetchlet type, so naturally I chose SNMP. Whenever any of my metric collections ran I would get the error:
oracle.sysman.emSDK.agent.fetchlet.exception.UnknownFetchletException: Could not create instance of fetchlet : SNMP
at oracle.sysman.gcagent.target.interaction.execution.FetchletFactory.getItem(FetchletFactory.java:308)
at oracle.sysman.gcagent.target.interaction.execution.FetchletFactory.loadFetchlet(FetchletFactory.java:317)
at oracle.sysman.gcagent.target.interaction.execution.FetchletFactory.<init>(FetchletFactory.java:207)
at oracle.sysman.gcagent.target.interaction.execution.ExecuteTask.runTask(ExecuteTask.java:3900)
at oracle.sysman.gcagent.target.interaction.execution.ExecuteTask.call(ExecuteTask.java:5098)
at oracle.sysman.gcagent.metadata.impl.collection.MetricColl$1.call(MetricColl.java:552)
at oracle.sysman.gcagent.metadata.impl.collection.MetricColl$1.call(MetricColl.java:513)
at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.accountedCall(TaskFutureImpl.java:599)
at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.call(TaskFutureImpl.java:643)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at oracle.sysman.gcagent.task.TaskFutureImpl.run1(TaskFutureImpl.java:380)
at oracle.sysman.gcagent.task.TaskFutureImpl.run(TaskFutureImpl.java:337)
at oracle.sysman.gcagent.task.CompositeTask.runSubtask(CompositeTask.java:70)
at oracle.sysman.gcagent.task.CompositeTask.run(CompositeTask.java:77)
at oracle.sysman.gcagent.metadata.impl.collection.CollectionItem$CollectionItemTask.run(CollectionItem.java:2536)
at oracle.sysman.gcagent.task.AbstractTemplateTask.call(AbstractTemplateTask.java:198)
at oracle.sysman.gcagent.task.AbstractTemplateTask.call(AbstractTemplateTask.java:49)
at oracle.sysman.gcagent.task.executor.DiagWrappedTask.call(DiagWrappedTask.java:60)
at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.accountedCall(TaskFutureImpl.java:599)
at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.call(TaskFutureImpl.java:643)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at oracle.sysman.gcagent.task.TaskFutureImpl.run1(TaskFutureImpl.java:380)
at oracle.sysman.gcagent.task.TaskFutureImpl.run(TaskFutureImpl.java:337)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at oracle.sysman.gcagent.task.executor.TrackThreadFactory$1.run(TrackThreadFactory.java:54)
at oracle.sysman.gcagent.util.system.GCAThread$RunnableWrapper.run(GCAThread.java:189)
at java.lang.Thread.run(Thread.java:662)
I spent ages trying to figure out what the hell I'd done wrong.  Rather unsurprisingly, after several hours of hacking I figured out that my first mistake was using the Plugin Builder that ships with the EDK 12.1.0.4, or more specifically using it to choose and define my metric definitions and collections.  The Plugin Builder defines the SNMP Fetchlet ID as "SNMP"; it should be "Snmp". After manually editing the targetType XML and redeploying my plugin, it works a treat.  Make sure, if you are using an SNMP fetchlet, that you define the QueryDescriptor as:
<QueryDescriptor FETCHLET_ID="Snmp">
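If you have a lot of metrics, fixing each QueryDescriptor by hand gets tedious. Here is a minimal sketch of doing the edit with a script instead. It assumes the only thing wrong is the attribute value FETCHLET_ID="SNMP"; the function name fix_snmp_fetchlet_id is hypothetical. It works on the raw text rather than re-serialising the XML, so the rest of the file is left untouched.

```python
import re

# FETCHLET_ID="SNMP" (or with single quotes / spaces around "=").
# Group 1 is the attribute name and "=", group 2 is the quote character.
WRONG_ID = re.compile(r"""(FETCHLET_ID\s*=\s*)(["'])SNMP\2""")


def fix_snmp_fetchlet_id(xml_text):
    """Return (fixed_text, replacements) with "SNMP" changed to "Snmp".

    Only the exact upper-case value is rewritten; descriptors that already
    use "Snmp" or any other fetchlet ID are left alone.
    """
    return WRONG_ID.subn(r"\g<1>\g<2>Snmp\g<2>", xml_text)
```

Read your targetType XML into a string, pass it through the function, check that the replacement count is what you expect, and only then write the result back before redeploying. Keep a copy of the original file first.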

Thursday, 15 January 2015

Enterprise Manager System Errors URL

In a time not too long ago there was a product called Enterprise Manager Grid Control 10g.  This product has since morphed into what is now called Enterprise Manager Cloud Control 12c.

I was recently asked by someone to investigate a problem they had with their EM12c implementation, and one of the things I wanted to look at was the repository errors.  These errors are stored within the SYSMAN schema in a table called MGMT_SYSTEM_ERROR_LOG. There is also a table called MGMT_SYSTEM_PERFORMANCE_LOG that makes for interesting reading, but we'll cover that another time.

The data in these tables doesn't hang around forever; it is purged after 30 days.

In 10g this was easy: you simply navigated to https://yourhost.com:7799/em/console/health/healthSystemError and you could filter the errors by various attributes.  This was one of the pages I visited regularly to make sure that my EM system was running smoothly.  Okay, in 10g it had its limitations: it only displayed the first 2000 errors (what the hell, first 2000 errors, I hear you gasp). Yes, that's right, just the first 2000, but as I said you can filter the results.  My small test system has a mere 20 agents but it has generated approximately 2500 errors in the last 30 days.  Okay, it's not as bad as it first seems: there are only 185 distinct errors, but nonetheless that's 185 things I could potentially have to do something about.

So the point of this blog.

Don't despair, the page is still there. It's not gone; there is just no link to it from the current UI. You just have to know it exists and Bob's your uncle ....